
Comments (24)

mwidman commented on July 29, 2024

Oops, should note that during an attempt at debugging this I found that there were many tcp/ip connections left open in the "CLOSE_WAIT" state. I am not sure if this is the cause (or even related) but it did seem that when some of those finally closed, another thread would finally proceed.

from pyrax.

EdLeafe commented on July 29, 2024

In the following block from your code:

        try:
            logger.info("Thread %s: Checking existing logs from rackspace." % self.name)
            remote_file = container.get_object("%s\\%s.json" % (my_id, variable))
            logger.info("Thread %s: Received existing logs sized %d bytes from rackspace." % (self.name, remote_file.total_bytes))
            contents = remote_file.get()
            logger.info("Thread %s: Read contents from files" % (self.name))
        except:
            tries += 1
            logger.error("Thread %s: Error reading logs from rackspace. Retrying attempt %d of %d" % (self.name, tries, self.max_tries))

...can you change the except: statement to except Exception as e:, and then log the type of exception and its message?
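
A minimal sketch of what that change could look like, so that each failure records what was actually raised. The helper names `describe_exception` and `fetch_with_logging`, and the `fetch` callable standing in for the `container.get_object(...)` / `remote_file.get()` calls, are hypothetical and not part of pyrax:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("rackspace_reader")


def describe_exception(exc):
    """Return 'TypeName: message' for any exception instance."""
    return "%s: %s" % (type(exc).__name__, exc)


def fetch_with_logging(fetch, thread_name, max_tries):
    """Call fetch() up to max_tries times, logging the type and message of each failure."""
    tries = 0
    while tries < max_tries:
        try:
            return fetch()
        except Exception as e:
            tries += 1
            logger.error("Thread %s: attempt %d of %d failed (%s)",
                         thread_name, tries, max_tries, describe_exception(e))
    # All attempts failed; the caller decides what to do next.
    return None
```

Catching `Exception` rather than using a bare `except:` also lets `KeyboardInterrupt` and `SystemExit` propagate, which makes a stuck script easier to stop.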

Also, do you have eventlet installed? If so, does monkeypatching the socket module change anything?

import eventlet
eventlet.monkey_patch(socket=True)

mwidman commented on July 29, 2024

I’ve just tried with your suggestion of using eventlet’s monkey_patch for socket and it “seemed” to reduce the wait time at the end compared with a base case but from experience the wait time is highly variable so I’m not sure it’s indicative of anything.

With regard to logging the exceptions, I get about 10-15 “No object with the name 'my_id.json' exists” errors out of ~400 ids, simply because of the data that I am using (some of the ids don’t have log files on Rackspace yet).

I also tried using monkey_patch without the parameter (which as I understand it patches all libraries) but it also did not seem to have an effect.

EdLeafe commented on July 29, 2024

OK, it was worth a shot. Since pyrax uses python-swiftclient for all connections to the API server, I don't have direct control over the way that the connections are handled.

Re: exceptions – I just wanted to make sure that nothing else was going on that your bare except was swallowing up.

Just a wild thought: what if you created a separate connection for each thread? IOW, move the lines right after authenticating into the thread's init:

class RackspaceReader(threading.Thread):
    def __init__(self, max_tries, processing_queue):
        threading.Thread.__init__(self)
        self.max_tries = max_tries
        self.processing_queue = processing_queue
        self.cfconn = pyrax.connect_to_cloudfiles(region=REGION, public=USE_PUBLIC)
        self.container = self.cfconn.create_container(HISTORY_CONTAINER)

and then use self.container in the other thread methods.
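
The same idea can be sketched without any Rackspace dependency, assuming a hypothetical `connect` factory that stands in for a call such as `pyrax.connect_to_cloudfiles(...)`. The class and function names here are illustrative only:

```python
import threading


class PerThreadReader(threading.Thread):
    """Worker that owns its connection instead of sharing one across threads."""

    def __init__(self, connect, results):
        threading.Thread.__init__(self)
        # `connect` is a hypothetical factory; each worker gets its own object.
        self.conn = connect()
        self.results = results

    def run(self):
        # Record which connection this thread used
        # (list.append is atomic in CPython).
        self.results.append((self.name, self.conn))


def run_readers(connect, count):
    """Start `count` workers, wait for them, and return what they recorded."""
    results = []
    readers = [PerThreadReader(connect, results) for _ in range(count)]
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()
    return results
```

Calling `run_readers(object, 4)` yields four entries, each holding a distinct connection object, which is the property the suggestion above relies on.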

mwidman commented on July 29, 2024

Yes, that was actually how I originally had it, but when I saw all of the open TCP/IP connections I assumed I shouldn’t do that and switched to one shared connection (which seems to open a similar number of TCP/IP connections anyway). Again, I am not sure the outstanding TCP/IP connections actually have an effect, other than that after some random period of time a chunk of those in the “CLOSE_WAIT” state are cleared out and the script starts running again (although I am not sure which is causing which). The only other thing I’ve noticed is that if I do the same thing without threads, grabbing all of the Cloud Files objects and then their contents in a for-loop, it only ever seems to open one TCP/IP connection.

I’ve been stumped for a while as to how to debug this further…

From: Ed Leafe [mailto:[email protected]]
Sent: May-13-13 3:52 PM
To: rackspace/pyrax
Cc: mwidman
Subject: Re: [pyrax] Pyrax CloudFiles extremely slow in threaded application (#69)

mwidman commented on July 29, 2024

After speaking with Rackspace support, it seems the issue is likely due to a piece of code in swiftclient that doubles a "backoff" time (used in a time.sleep() call) every time it retries to read/write. The only thing is, our logs showed time delays of 1023 in one case (and I am sure more in others) which implies 10 retries. I am not sure how that happens as swiftclient defaults to 5 and I do not see any code in pyrax that modifies that.

Ideally, swiftclient would have a "max backoff" setting to prevent the delay from growing beyond that point. For now, from the pyrax side, I was wondering whether this backoff time is exposed (or could be exposed) in the Connection class, so that the default can be set to some very small number instead of 1s?
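
The arithmetic behind the "1023 implies 10 retries" reading can be checked with a simplified model of a doubling backoff. This is only a sketch: `backoff_schedule` and its `max_backoff` parameter are hypothetical names, and the real swiftclient retry loop may order the sleep and the doubling differently.

```python
def backoff_schedule(retries, starting_backoff=1, max_backoff=None):
    """Return the list of sleep durations for a doubling backoff.

    Simplified model: sleep for the current backoff, then double it,
    optionally clamping it to max_backoff.
    """
    delays = []
    backoff = starting_backoff
    for _ in range(retries):
        delays.append(backoff)
        backoff *= 2
        if max_backoff is not None:
            backoff = min(backoff, max_backoff)
    return delays


# Ten doubling sleeps starting at 1s add up to 1 + 2 + ... + 512 = 1023s.
print(sum(backoff_schedule(10)))
# Five retries (the default mentioned above) would only add up to 31s.
print(sum(backoff_schedule(5)))
# With an illustrative cap of 64s, ten retries total 319s instead of 1023s.
print(sum(backoff_schedule(10, max_backoff=64)))
```

Under this model an uncapped schedule grows quickly: each additional retry roughly doubles the total wait, which is why a cap makes a large difference once the retry count is higher than expected.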

sivel commented on July 29, 2024

mwidman,

This is something I have been working on for a while, and thus far what I have been able to discern is that the httplib connection used by swiftclient is not thread safe. Even if you try to use a shared connection, it fails and ends up opening its own new socket, which creates further issues. Most people have seen speed improvements by instantiating a new connection for each upload thread.

In my testing, and with code that I have written, I generally allow the number of threads to be specified, and split my files into as many equally sized chunks as there are threads. That would let you create perhaps 10 connections, one per thread, with each connection shared across the upload of that thread's segment of files.
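
A rough sketch of that chunk-per-thread layout, assuming hypothetical `connect` and `upload` callables in place of the real Cloud Files calls (neither name comes from pyrax or swiftclient):

```python
import threading


def split_into_chunks(items, num_chunks):
    """Split items into num_chunks contiguous lists whose sizes differ by at most one."""
    base, extra = divmod(len(items), num_chunks)
    chunks = []
    start = 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def upload_all(files, num_threads, connect, upload):
    """Upload files using num_threads workers, one connection per worker."""

    def worker(chunk):
        # The connection is created inside the thread and reused only
        # for this thread's own chunk of files.
        conn = connect()
        for name in chunk:
            upload(conn, name)

    threads = [threading.Thread(target=worker, args=(chunk,))
               for chunk in split_into_chunks(files, num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

For example, `split_into_chunks(list(range(10)), 3)` gives `[[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]`, so no thread handles more than one file above its fair share.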

I've been working with eventlet to improve the functionality, and currently have it in a script that does not use pyrax, but I hope to be able to integrate a multi-threaded uploader, downloader, and deleter back into pyrax if it eventually works out.

mwidman commented on July 29, 2024

I have opened a bug on python-swiftclient to ask them to introduce a maximum backoff time. This doesn't fix the underlying thread-safety issue, but it will hopefully at least prevent 1-hour delays where a script sits and does nothing. (It seems to me that if a script requires more than a couple of minutes of wait time between tries to be able to access Rackspace, there is something seriously wrong...)

The bug is listed here: https://bugs.launchpad.net/python-swiftclient/+bug/1183542

mwidman commented on July 29, 2024

Just a status update of sorts on this one. I tried grabbing the swiftclient code and adding some prints to the Connection class' _retry.

            print "After sleep:"
            print "Number of tries: %s, Max number of tries: %s, Starting backoff: %s" % (self.attempts, self.retries, self.starting_backoff)
            print "Backoff now = %s" % backoff

When the Connection class is instantiated by Pyrax, the (max) retries are somehow set to a large number despite the fact that I could not find any code in pyrax that explicitly sets that parameter.

After that, the backoff variable itself somehow is being modified as I see output like:

After sleep:
Number of tries: 2, Max number of tries: , Starting backoff: 1
Backoff now = 64

So there may(?) be a bug in pyrax with regards to the number of retries but otherwise this seems like a python-swiftclient issue. So if anyone would like to vote for the previously linked bug to encourage them to fix it that would be greatly appreciated.

kmarkiv commented on July 29, 2024

I upgraded python-swiftclient but am still facing a similar issue.

EdLeafe commented on July 29, 2024

I didn't realize it at first, but that wild number for self.retries is your account number. I traced that to a bug in the way that the parameters were sent to the connection object when it is created. The fix is in the working branch for now.
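
The thread does not show the actual pyrax code, so the following is only a hypothetical illustration of how this class of bug can arise: when arguments are passed positionally, an extra value can land in a slot meant for something else. `ExampleConnection` and its parameter order are made up for the example.

```python
class ExampleConnection(object):
    """Hypothetical connection class; the parameter order is invented for illustration."""

    def __init__(self, authurl=None, user=None, key=None, retries=5,
                 starting_backoff=1):
        self.authurl = authurl
        self.user = user
        self.key = key
        self.retries = retries
        self.starting_backoff = starting_backoff


def connect_positional(authurl, user, key, account_id):
    # BUG: the fourth positional slot is `retries`, so account_id ends up there.
    return ExampleConnection(authurl, user, key, account_id)


def connect_keyword(authurl, user, key, account_id):
    # Passing by keyword leaves `retries` at its default; the account id
    # is simply not forwarded because this class has no parameter for it.
    return ExampleConnection(authurl=authurl, user=user, key=key)
```

With an account number such as 123456, `connect_positional(...)` produces a connection whose `retries` is 123456, while `connect_keyword(...)` keeps the default of 5.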

mwidman commented on July 29, 2024

Ed, after looking at your commit, it makes sense now where the number was coming from. I also submitted a fix to swiftclient to institute a maximum backoff time of 64 seconds. The fix got in, but not in time for 1.5.0, so anyone who wants to make use of it will need to pip install from the latest GitHub commits.

kmarkiv commented on July 29, 2024

@EdLeafe I am using the latest version of python-swiftclient and the latest version of pyrax (from git), but the uploads get stuck after a while. I am running 8 Python processes to upload images in parallel.

Here is the sample code:

rack_conn = cloudfiles.get_connection("account", "password")
file_name = os.path.join(public_dir, image_path)
pic_container = rack_conn.create_container(data["cdn_containers"][container_type])

# Only upload if the local image actually exists.
if os.path.isfile(file_name):
    pic_container.make_public()
    pic_container.upload_file(file_name, file_jpg)
    url = "%s/%s" % (pic_container.cdn_uri, file_jpg)
    print(url)

Am I supposed to close the connections after upload?

kmarkiv commented on July 29, 2024

Sorry, I checked out from master. I will try with the working branch and see.

kmarkiv commented on July 29, 2024

I am facing the same issue even with the working branch. I am not certain whether the bug is in the library.

EdLeafe commented on July 29, 2024

@kmarkiv Do you get any sort of error output? Do some processes complete while others do not? Does this get stuck at around the same time of day, or after running for a given amount of time?

kmarkiv commented on July 29, 2024

It's related to the network, I assume. The processes get stuck after 16 uploads whenever I run it.
They are all running simultaneously, and we need to upload around 100 images at a time through a queue. The setup seems to work on another machine, so this could be related to the server or Python version.

kmarkiv commented on July 29, 2024

I experienced the same problem with Python 2.7.5 on Ubuntu 10.04, but it works with Python 2.7.1 on Ubuntu 11.04.
Is there a way to forcefully close the connection after uploading? (That would solve my problem.)

EdLeafe commented on July 29, 2024

Yes:

pyrax.cloudfiles.connection.http_conn[1].close()

I should have a new release out later today that should fix some other performance issues; I'm not sure that it would solve your issue, but it's worth trying out.

kmarkiv commented on July 29, 2024

@EdLeafe I need to set CORS headers for the files. This was possible in python-cloudfiles, but I am unable to find the API for it in pyrax. I need the image (the object uploaded to Rackspace) to return the following header:

"Access-Control-Allow-Origin": "*"

Thanks in advance

kmarkiv commented on July 29, 2024

I need to do the same as in the following answer: http://stackoverflow.com/a/10425094

EdLeafe commented on July 29, 2024

@kmarkiv That is a great enhancement. I'm on vacation this week; could you please add an issue for this?

kmarkiv commented on July 29, 2024

@EdLeafe I opened an issue here
#16

Thanks a lot for your help and enjoy your vacation 👍

mwidman commented on July 29, 2024

I think the original problem referenced in this issue has now been fixed through the change in pyrax and swiftclient.
