
Comments (51)

patrickbajao avatar patrickbajao commented on August 16, 2024

Hi,

Will there be a 4.0.12 release? I can see that the fix was already backported in the PHP5 branch but I don't know when it'll be released.

from apcu.

jr997 avatar jr997 commented on August 16, 2024

I've experienced the same issue, but found the cause (at least in my case)

At one point I called apc_cache_info() when the result of that call was too big to fit in PHP's memory_limit. This gave me a "500 Internal Server Error", which seems acceptable, BUT it failed to release the lock. Therefore subsequent calls, in that or other Apache child processes, all waited for a lock that was never going to be released.

This does seem like a bug in my humble opinion: the system should never end up in this state, no matter what flaw the PHP programmer might introduce.

One thing I'd like to suggest is that apc_cache_info() return something like FALSE if the memory_limit prevents the APC system from returning the requested data.

ps. to whoever needs a quick-fix: I simply changed the memory_limit in the script that called apc_cache_info(), and I haven't seen the problem since.

I use 4.0.10
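
The failure mode described above can be sketched in C. This is an illustration only, assuming the out-of-memory error unwinds with a non-local jump (PHP's bailout is built on setjmp/longjmp) while a read lock is held. The names `cache_lock`, `build_info_or_bail` and `lock_leaks_after_bailout` are made up for the sketch and are not APCu code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <setjmp.h>

static pthread_rwlock_t cache_lock = PTHREAD_RWLOCK_INITIALIZER;
static jmp_buf bailout;   /* stand-in for the engine's bailout target */

/* Hypothetical stand-in for building the info array: pretends the
 * memory limit was hit and jumps out, skipping the unlock below. */
static void build_info_or_bail(void)
{
    longjmp(bailout, 1);
}

/* Returns 1 if the lock is still held after the "request" ended. */
int lock_leaks_after_bailout(void)
{
    int rc;

    if (setjmp(bailout) == 0) {
        pthread_rwlock_rdlock(&cache_lock);
        build_info_or_bail();
        pthread_rwlock_unlock(&cache_lock);   /* never reached */
    }

    /* A writer now finds the lock busy; a blocking wrlock would hang. */
    rc = pthread_rwlock_trywrlock(&cache_lock);
    if (rc == 0) {
        pthread_rwlock_unlock(&cache_lock);
        return 0;
    }
    assert(rc == EBUSY);
    return 1;
}
```

In this sketch lock_leaks_after_bailout() returns 1: the jump skips the unlock, so every later writer sees the lock as busy, which matches the hang described.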


krakjoe avatar krakjoe commented on August 16, 2024

Could we know a bit about the software you are running, or anything else about your setup that might help to reproduce the same behaviour ...

frame appears significant, but it's difficult to guess what has caused it ...


weltling avatar weltling commented on August 16, 2024

@anoakie ping ... any update on this, a repro script maybe?


anoakie avatar anoakie commented on August 16, 2024

We tried with both PHP 5.5 RC3 and PHP 5.5.0 final. It's quite hard to reproduce. It happens to random servers in our apache cluster during heavy load every few days.
We're running Ubuntu's apache 2.2.22 prefork. We build a custom PHP 5.5.0 package.
We've reverted to PHP 5.3 + APC for now. I'm currently on sabbatical, so I'm not able to look into this further.
I'm adding my coworkers to this thread so they hopefully expand on this while I'm gone.


denji avatar denji commented on August 16, 2024

@anoakie alternative https://github.com/laruence/yac


krakjoe avatar krakjoe commented on August 16, 2024

@anoakie it's been a while ... I've got to assume you still have the error (I think others have mentioned the same thing) ... without a way to reproduce I'm not able to debug ... it may be that your operating system's default rwlock is not suitable in some way (which we'll eventually find and fix when we can reproduce reliably and determine that to be the definite cause) ... the only workaround I can suggest to get things going is to disable the use of rwlocks with --disable-apcu-rwlocks on configure for apcu ...


magicmonkey avatar magicmonkey commented on August 16, 2024

We're experiencing what I think is this problem as well - all our Apache children hang, and a GDB trace shows this:

#0  0x00002b50542905f0 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1  0x00002b505faf253d in apc_lock_wlock (lock=0x2b5071483088) at /usr/src/redhat/BUILD/php-pecl-apcu-4.0.2/apc_lock.c:252
#2  0x00002b505faf61a5 in apc_cache_insert (cache=0x2b5071449d60, key=..., value=0x2b5071943a38, ctxt=0x7fffeae5fc60, t=1381421665, exclusive=0 '\000')
    at /usr/src/redhat/BUILD/php-pecl-apcu-4.0.2/apc_cache.c:841
#3  0x00002b505faf64a5 in apc_cache_store (cache=0x2b5071449d60, 
    strkey=0x2b5067d86480 "/var/lamp/code/v-2013-10-r63/application-apps--/etc/environment.ini-config", keylen=75, val=0x2b5052a11250, ttl=3600, 
    exclusive=0 '\000') at /usr/src/redhat/BUILD/php-pecl-apcu-4.0.2/apc_cache.c:440
#4  0x00002b505faf3bff in apc_store_helper (ht=<value optimized out>, return_value=0x2b5067c137c0, return_value_ptr=<value optimized out>, 
    this_ptr=<value optimized out>, return_value_used=<value optimized out>, exclusive=0 '\000')
    at /usr/src/redhat/BUILD/php-pecl-apcu-4.0.2/php_apc.c:659
#5  0x00002b505d1aee59 in zend_do_fcall_common_helper_SPEC (execute_data=0x2b50527e5578) at /usr/src/redhat/BUILD/php-5.5.3/Zend/zend_vm_execute.h:543
#6  0x00002b505d1ed7f8 in execute_ex (execute_data=0x2b50527e5578) at /usr/src/redhat/BUILD/php-5.5.3/Zend/zend_vm_execute.h:356
#7  0x00002b505d17b7da in zend_execute_scripts (type=8, retval=0x0, file_count=3) at /usr/src/redhat/BUILD/php-5.5.3/Zend/zend.c:1316
#8  0x00002b505d11c699 in php_execute_script (primary_file=0x7fffeae62340) at /usr/src/redhat/BUILD/php-5.5.3/main/main.c:2484
#9  0x00002b505d22828d in php_handler (r=0x2b509186a968) at /usr/src/redhat/BUILD/php-5.5.3/sapi/apache2handler/sapi_apache2.c:667
#10 0x00002b50527a5e8a in ap_run_handler ()
#11 0x00002b50527a9318 in ap_invoke_handler ()
#12 0x00002b50527b3c7a in ap_internal_redirect ()
#13 0x00002b505b5f7c70 in ap_make_dirstr_parent () from /etc/httpd/modules/mod_rewrite.so
#14 0x00002b50527a5e8a in ap_run_handler ()
#15 0x00002b50527a9318 in ap_invoke_handler ()
#16 0x00002b50527b3e28 in ap_process_request ()
#17 0x00002b50527b1010 in ?? ()
#18 0x00002b50527ad112 in ap_run_process_connection ()
#19 0x00002b50527b82c9 in ?? ()
#20 0x00002b50527b855a in ?? ()
#21 0x00002b50527b8dbd in ap_mpm_run ()
#22 0x00002b5052792fd8 in main ()

This is on a not-particularly-busy server. There are a couple of places in our code where we write to APCu, and various Apache children are stuck on writing various different keys (ie they're at different points in our code but all are hung when writing to APCu)

I'm guessing something has obtained the futex and not released it for some reason, but I'm not sure how to find out what.
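
One way to confirm that a lock is held and never released, without hanging the caller, is to wait with a deadline. The sketch below uses the POSIX call pthread_rwlock_timedwrlock on an ordinary in-process lock. APCu's locks live in shared memory and it does not necessarily work this way; the helper names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Try to take the write lock, giving up after timeout_ms milliseconds.
 * Returns 0 on success or an errno value such as ETIMEDOUT. */
int wrlock_with_deadline(pthread_rwlock_t *lock, long timeout_ms)
{
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_rwlock_timedwrlock(lock, &deadline);
}

/* Simulates a reader that never unlocks, then shows the writer giving up.
 * Returns 1 if the stuck lock was reported instead of blocking forever. */
int stuck_lock_is_detected(void)
{
    pthread_rwlock_t lock;
    int rc;

    pthread_rwlock_init(&lock, NULL);
    pthread_rwlock_rdlock(&lock);             /* leaked on purpose */
    rc = wrlock_with_deadline(&lock, 50);     /* waits about 50 ms */
    pthread_rwlock_unlock(&lock);
    pthread_rwlock_destroy(&lock);

    /* POSIX also allows EDEADLK when the caller itself holds the lock. */
    return rc == ETIMEDOUT || rc == EDEADLK;
}
```

A bounded wait like this turns a silent hang into an error that can be logged along with the lock state.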

Here's the APC bit from "php -i" (I'm pretty sure we're using the same config in Apache as on the CLI):

apc

APC support => Emulated

apcu

APCu Support => enabled
Version => 4.0.2
APCu Debugging => Disabled
MMAP Support => Enabled
MMAP File Mask => /tmp/apc.Zk0HWP
Serialization Support => php, eval
Revision => $Revision: 328290 $
Build Date => Oct  9 2013 10:17:53

Directive => Local Value => Master Value
apc.coredump_unmap => Off => Off
apc.enable_cli => On => On
apc.enabled => On => On
apc.entries_hint => 8192 => 8192
apc.gc_ttl => 3600 => 3600
apc.mmap_file_mask => /tmp/apc.Zk0HWP => /tmp/apc.Zk0HWP
apc.preload_path => no value => no value
apc.rfc1867 => Off => Off
apc.rfc1867_freq => 0 => 0
apc.rfc1867_name => APC_UPLOAD_PROGRESS => APC_UPLOAD_PROGRESS
apc.rfc1867_prefix => upload_ => upload_
apc.rfc1867_ttl => 3600 => 3600
apc.serializer => php => php
apc.shm_segments => 1 => 1
apc.shm_size => 512M => 512M
apc.slam_defense => On => On
apc.smart => 0 => 0
apc.ttl => 7200 => 7200
apc.use_request_time => On => On
apc.writable => /tmp => /tmp


magicmonkey avatar magicmonkey commented on August 16, 2024
(gdb) frame 1
#1  0x00002b505faf253d in apc_lock_wlock (lock=0x2b5071483088) at /usr/src/redhat/BUILD/php-pecl-apcu-4.0.2/apc_lock.c:252
252 /usr/src/redhat/BUILD/php-pecl-apcu-4.0.2/apc_lock.c: No such file or directory.
    in /usr/src/redhat/BUILD/php-pecl-apcu-4.0.2/apc_lock.c
(gdb) p *lock
$1 = {__data = {__lock = 0, __nr_readers = 4294967293, __readers_wakeup = 6, __writer_wakeup = 5, __nr_readers_queued = 0, __nr_writers_queued = 50, 
    __writer = 0, __shared = 128, __pad1 = 0, __pad2 = 0, __flags = 0}, 
  __size = "\000\000\000\000\375\377\377\377\006\000\000\000\005\000\000\000\000\000\000\000\062\000\000\000\000\000\000\000\200", '\000' <repeats 26 times>, __align = -12884901888}
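
One detail in the dump above is worth spelling out: __nr_readers = 4294967293 is 2^32 - 3, the value an unsigned 32-bit counter shows after being decremented three times past zero. The bytes \375\377\377\377 in __size are the same field in little-endian order. The sketch below only checks that arithmetic. Reading it as more read-unlocks than read-locks is an interpretation, not a confirmed diagnosis, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a little-endian 32-bit field, as laid out in the __size dump. */
uint32_t le32(const unsigned char b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* How far an unsigned counter sits below zero after wrapping around. */
uint32_t wrapped_below_zero(uint32_t counter)
{
    return (uint32_t)(0u - counter);
}
```

Feeding the four bytes 0xfd 0xff 0xff 0xff to le32() gives 4294967293, and wrapped_below_zero() of that value gives 3.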


rathers avatar rathers commented on August 16, 2024

I believe I have managed to reproduce this issue with a test script and some load injection. I've uploaded the test script and instructions here:

https://github.com/rathers/apcu-repro

Essentially the test script does a load of apc_store and apc_fetch calls against the same key. Loading this at increasing rates eventually triggers the number of apache children to spiral out of control and lock up.


krakjoe avatar krakjoe commented on August 16, 2024

@rathers Thanks for your effort ... got some time this evening to have a look, think we got it ... can I get some feedback ??


glenjamin avatar glenjamin commented on August 16, 2024

Hi, I'm trying to get a handle on what has changed in that diff to reason about how the fix works.

As far as I can tell, it:

  • removes eval serialiser (unrelated)
  • ensures apache cannot kill a thread while it has a lock
  • checks the state of cache-busy more often

Under what circumstances is the cache considered "busy"?


krakjoe avatar krakjoe commented on August 16, 2024

The eval serializer is unrelated, it was out in the wild for testing, and caused more harm than good ... so it's gotta go ...

The cache is considered busy during gc, which is invoked implicitly throughout, sometimes by apc itself, sometimes by the allocator underneath ... allowing php to call a function that operates on a busy cache is futile ...

The thing that fixes the problem with apache is utilizing [un]block interruptions on locks, just like APC did ...


rathers avatar rathers commented on August 16, 2024

@krakjoe, thanks for the commit. We've rebuilt apcu including this commit and deployed it to a test box. It hasn't made any difference :( We're still seeing the same locking problem, reproducible in exactly the same way as before. Any further ideas to try out?


krakjoe avatar krakjoe commented on August 16, 2024

Damn ... I have that pulse you get under your eye when things get a bit stressful ...

I'm a bit baffled, let me have some thinking time ...

I'm not able to make apache spiral out of control anymore, the number of processes remain steady and apache remains responsive but slow ...


rathers avatar rathers commented on August 16, 2024

What version of apache are you using? We're still on 2.2 which may or may not make a difference!


krakjoe avatar krakjoe commented on August 16, 2024
[joe@fiji php-src]$ httpd-nts -v
Server version: Apache/2.2.23 (Unix)
Server built:   Mar 15 2013 10:39:59
[joe@fiji php-src]$ httpd-zts -v
Server version: Apache/2.2.23 (Unix)
Server built:   Sep 16 2013 09:55:35


krakjoe avatar krakjoe commented on August 16, 2024

try now ??

bit of a mistake on my part there, and an omission ... this has got to work ...


krakjoe avatar krakjoe commented on August 16, 2024

here are the options I am running http_load with:

./http_load -parallel 100 -rate 10 -seconds 100 apc-bench

I've run it several times (lots and lots and still am) like that without a problem ... is that enough ??


rathers avatar rathers commented on August 16, 2024

That looks a lot higher than I've managed to achieve. How beefy is your box? Try it without -parallel as I'm not sure which would take precedence.

We're just redeploying APCu with your latest patch then will try again


krakjoe avatar krakjoe commented on August 16, 2024

8 (4+HT) cores @ 3.4 GHz with 16GB DDR3 ... when I run it more aggressively (x10) the number of processes spawned to handle requests does shoot up, but they are all, eventually, shut down gracefully and so cannot be blocking waiting to acquire a mutex ...


krakjoe avatar krakjoe commented on August 16, 2024

@rathers I can't stand the suspense ??


rathers avatar rathers commented on August 16, 2024

I'm starting to get very confused. I think there may be two things at play here:

1 - apache lockups
2 - apache procs exploding in number

(2) is still happening, even on the latest build.
(1) Seems to be happening sometimes depending on the ini settings!

It is possible we only ever encountered (1) because (2) can occur, I'm really not sure.

If we set apc.ttl and apc.gc_ttl to 0 (as opposed to some number) then (1) doesn't happen. The apache procs will recover and reduce in number if we remove the load.

I've been trying to think what would cause (2) and wonder if there is some hard limit in APC/PHP/Apache/Linux somewhere that limits how many locks can be processed in a second. Do you think the kernel could have anything to do with it? Our test boxes are fairly old (CentOS 5.9) with a 2.6.18 kernel.

WDYT?


krakjoe avatar krakjoe commented on August 16, 2024

This is APC:

[joe@fiji http_load-12mar2006]$ ./http_load -parallel 1000 -rate 100 -seconds 10 apc-bench
21 fetches, 979 max parallel, 147 bytes, in 10.0003 seconds
7 mean bytes/connection
2.09993 fetches/sec, 14.6995 bytes/sec
msecs/connect: 0.132667 mean, 0.224 max, 0.086 min
msecs/first-response: 3733.66 mean, 9703.39 max, 1079.31 min
HTTP response codes:
  code 200 -- 21

This is APCu:

[joe@fiji http_load-12mar2006]$ ./http_load -parallel 1000 -rate 100 -seconds 10 apc-bench
161 fetches, 839 max parallel, 4.00724e+07 bytes, in 10.0004 seconds
248897 mean bytes/connection
16.0994 fetches/sec, 4.0071e+06 bytes/sec
msecs/connect: 0.132323 mean, 0.216 max, 0.082 min
msecs/first-response: 1676.21 mean, 3298.75 max, 3.642 min
HTTP response codes:
  code 200 -- 161

They both behave the same way with regard to apache processes now: lots are created but eventually shut down.

I don't think there is a bug present anymore, tried comparing behaviour with normal APC ??


krakjoe avatar krakjoe commented on August 16, 2024

This is APCu now:

[joe@fiji http_load-12mar2006]$ ./http_load -parallel 1000 -rate 100 -seconds 10 apc-bench
173 fetches, 827 max parallel, 4.30592e+07 bytes, in 10.0005 seconds
248897 mean bytes/connection
17.2991 fetches/sec, 4.3057e+06 bytes/sec
msecs/connect: 0.121665 mean, 0.231 max, 0.071 min
msecs/first-response: 1792.06 mean, 3494.57 max, 3.738 min
HTTP response codes:
  code 200 -- 173


krakjoe avatar krakjoe commented on August 16, 2024

something does smell a bit fishy about that result ... I'm still looking, haven't given up ...


rathers avatar rathers commented on August 16, 2024

When you're comparing APC and APCu are they using different PHP versions?


rathers avatar rathers commented on August 16, 2024

The Apache process explosion is quite subtle and hard to explain, but using -rate 100 won't reveal it as it's too heavyweight. I found there was an implicit maximum rate that could be sustained (in our case around -rate 3 or 4) with Apache behaving perfectly normally, with just a few busy procs and response times constant at around 300ms. Then just by injecting a few manual requests with curl (and I really do mean a few, like 3 or 4!) the whole thing explodes. Response times increase into multiple seconds and beyond, and Apache spawns procs until it hits MaxClients. This just from a few extra requests! Never seen Apache behave like that before.

Depending on the setup (APCu version, compile settings, ini settings etc) the apache procs may or may not recover


krakjoe avatar krakjoe commented on August 16, 2024

I have adjusted a few things in the last few minutes, we should be completely consistent with apc now ... I get bad byte count messages though under extreme load ...

I am using the same version; their timings are now about equal, as we should expect them to be ...

I'm a little/lot tired, I need thinking time, I'll have to come back to it tomorrow ...

Thanks for your patience ... you're going to need some more :)


fauvel avatar fauvel commented on August 16, 2024

We are experiencing the same issue and could reproduce it with a very simple CLI script – so it's not directly Apache related!

<?php

$text = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';

$i = 0;
while (true) {
    echo "\r" . ++$i;
    apc_add((string) $i, $text);
}

The script hangs up after only 13'248 iterations if apc.shm_size has the "magical" value 16M. It runs through (so to say) with 15M!

The good thing is that you can reproduce it on our vagrant box:

git clone [email protected]:cargomedia/CM.git
cd CM
vagrant up
vagrant ssh # from now on in the vagrant box
sudo apt-get install emacs
sudo emacs /etc/php5/cli/conf.d/20-apcu.ini # set apc.shm_size to 16M
emacs apc.php # copy-paste the above script
php apc.php # hangs up

To get the backtrace in the vagrant box, run:

sudo apt-get install gdb
gdb # from now on in the gdb console
file /usr/bin/php5
run apc.php
^C # Interrupt the process hanging up
backtrace full

The relevant part of the backtrace is:

#0  0x00007ffff4d1cabd in pthread_rwlock_wrlock () from /lib/x86_64-linux-gnu/libpthread.so.0
No symbol table info available.
#1  0x00007ffff3b823b4 in apc_lock_wlock (lock=<optimized out>) at /tmp/91cdb7ca50775aa4c7a277127669e38b/apcu-4.0.4/apc_lock.c:252
No locals.
#2  0x00007ffff3b86cd8 in apc_cache_insert (cache=cache@entry=0xfe76f0, key=..., value=0x7fffe9222438, ctxt=ctxt@entry=0x7fffffffaac0, t=t@entry=1397131216, exclusive=exclusive@entry=1 '\001')
    at /tmp/91cdb7ca50775aa4c7a277127669e38b/apcu-4.0.4/apc_cache.c:736
No locals.
#3  0x00007ffff3b87333 in apc_cache_store (cache=0xfe76f0, strkey=0x7ffff7e72410 "13248", keylen=6, val=0x7ffff7e70960, ttl=0, exclusive=exclusive@entry=1 '\001')
    at /tmp/91cdb7ca50775aa4c7a277127669e38b/apcu-4.0.4/apc_cache.c:335
        entry = <optimized out>
        key = {str = 0x7ffff7e72410 "13248", len = 6, h = 6951369112743, mtime = 1397131216, owner = -21640}
        t = 1397131216
        ctxt = {pool = 0x7fffe92223a8, copy = APC_COPY_IN, force_update = 0, copied = {nTableSize = 0, nTableMask = 0, nNumOfElements = 0, nNextFreeElement = 0, pInternalPointer = 0x0, 
            pListHead = 0x0, pListTail = 0x0, arBuckets = 0x0, pDestructor = 0, persistent = 0 '\000', nApplyCount = 0 '\000', bApplyProtection = 0 '\000'}, serializer = 0x7ffff3d921a0, 
          key = 0x7fffffffaa90}
        ret = 0 '\000'
#4  0x00007ffff3b8409b in apc_store_helper (ht=<optimized out>, return_value=0x7ffff7e72780, exclusive=<optimized out>, 
    return_value_used=<error reading variable: Unhandled dwarf expression opcode 0xfa>, this_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>, 
    return_value_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>) at /tmp/91cdb7ca50775aa4c7a277127669e38b/apcu-4.0.4/php_apc.c:662
        key = 0x7ffff7e70900
        val = 0x7ffff7e70960
        ttl = 0

I hope this can help figuring out what's happening!


bogdan-plevit avatar bogdan-plevit commented on August 16, 2024

Hello,

We are experiencing the same issue described by rathers. In our setup we are using Nginx (version: nginx/1.7.3 ) and PHP-FPM (version 5.5.18). The APCu (version: 4.0.6 Revision: 328290) extension is loaded in PHP-FPM. This setup is used for now for our development infrastructure.

We have replicated this issue using the steps described here: https://github.com/rathers/apcu-repro. In our development setup we are using PHP-FPM with multiple pools, and we have managed to lock up only the pool that is processing the requests from the http_load tool. The other pools process requests fine until they hit the same issue.

If we run http_load at a relatively low rate (10) a couple of times, we also see an increase in response times when we run: time curl http://localhost/apc-bench.php.

We are planning to migrate our production environment (which also uses Nginx) from PHP-FPM (version 5.3.29) and APC (version 3.1.13) and we would like to use APCu as a drop-in replacement.

We are willing to provide any information or do debugging on the development environment.

Here is the output of php -i|grep -iE "^apc":
apc
APC support => Emulated
apcu
APCu Support => Disabled
APCu Debugging => Disabled
apc.coredump_unmap => Off => Off
apc.enable_cli => Off => Off
apc.enabled => On => On
apc.entries_hint => 4096 => 4096
apc.gc_ttl => 3600 => 3600
apc.mmap_file_mask => no value => no value
apc.preload_path => no value => no value
apc.rfc1867 => Off => Off
apc.rfc1867_freq => 0 => 0
apc.rfc1867_name => APC_UPLOAD_PROGRESS => APC_UPLOAD_PROGRESS
apc.rfc1867_prefix => upload_ => upload_
apc.rfc1867_ttl => 3600 => 3600
apc.serializer => php => php
apc.shm_segments => 1 => 1
apc.shm_size => 2048M => 2048M
apc.slam_defense => On => On
apc.smart => 0 => 0
apc.ttl => 1800 => 1800
apc.use_request_time => On => On
apc.writable => /tmp => /tmp


 avatar commented on August 16, 2024

@bogdan-plevit @krakjoe

Hi guys,

I've done some additional testing with @bogdan-plevit and compiled apcu with
"--disable-apcu-mmap". APC now works better: it still leaks memory, but fragmentation is lower. While I run the http_load test I can now access the APC info page, and curl works better. From what I see this is only a workaround; the main issue is still at large 👍.
If you need the steps to reproduce let me know. I will do some additional testing and hopefully see the problem.


denji avatar denji commented on August 16, 2024
[apc_cache.c:1220]: (error) va_list 'args' used before va_start() was called.
[apc_mmap.c:77] -> [apc_mmap.c:67]: (warning) Possible null pointer dereference: file_mask - otherwise it is redundant to check it against null.
[apc_mmap.c:86] -> [apc_mmap.c:67]: (warning) Possible null pointer dereference: file_mask - otherwise it is redundant to check it against null.
[apc_shm.c:58]: (warning) Return value of function strerror() is not used.
[php_apc.c:540] -> [php_apc.c:534]: (warning) Possible null pointer dereference: info - otherwise it is redundant to check it against null.

apcu/apc_mmap.c:97 (warning) Call to function 'mktemp' is insecure as it always creates or uses insecure temporary file.  Use 'mkstemp' instead
apcu/apc_pool.c:291 (warning) Value stored to 'redzone' is never read
apcu/apc_pool.c:252 (warning) Value stored to 'redsize' is never read
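
For the mktemp warning above, the usual replacement is mkstemp, which picks the name and creates the file in one atomic step. A minimal sketch follows, assuming a mask that ends in "XXXXXX" such as the apc.mmap_file_mask values shown earlier in the thread. The wrapper name `open_temp_from_mask` is hypothetical and this is not the code in apc_mmap.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Creates and opens a unique temporary file from a mask ending in "XXXXXX".
 * On success returns the open descriptor and rewrites `mask` in place with
 * the chosen name; on failure returns -1. */
int open_temp_from_mask(char *mask)
{
    int fd = mkstemp(mask);   /* name choice and creation happen atomically */
    if (fd == -1) {
        perror("mkstemp");
    }
    return fd;
}
```

Unlike mktemp, there is no window between choosing the name and opening the file in which another process could create it first. The mask must be a writable array, not a string literal, because mkstemp modifies it.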


fauvel avatar fauvel commented on August 16, 2024

Thank you guys for working on this issue, I would really appreciate any improvement of APCu stability at this point!


jerrylindahl avatar jerrylindahl commented on August 16, 2024

I seem to have encountered the same issue. On Ubuntu 14.04 with Apache 2.4.7 and PHP 5.5.9 with APCu 4.0.2 (also tested with APCu 4.0.6 from utopic), Apache stops responding at a random point while running our test suite.

Running strace then shows processes waiting in a futex call.

Not able to reproduce with the script provided by @fauvel

Could this issue be related to this? #85

We haven't seen the issue in prod, where we run an old Oracle Linux with Apache 2.4.10, PHP 5.5.20 and APCu 4.0.7.


adaladam avatar adaladam commented on August 16, 2024

I had the same issue with ownCloud 8.0.0: APCu hung Apache. The issue was solved by updating APCu to version 4.0.6. See also this thread, which might help you: https://bugs.launchpad.net/ubuntu/+source/php-apcu/+bug/1422484


krakjoe avatar krakjoe commented on August 16, 2024

Reports indicate that stability improved in later versions, so I'm closing the bug ... if I'm wrong, open a new bug.


krakjoe avatar krakjoe commented on August 16, 2024

Please try with the latest pecl release and report back ?


krakjoe avatar krakjoe commented on August 16, 2024

Bump, can you try with 4.0.10 or 5.1.2 please ?


masterdead avatar masterdead commented on August 16, 2024

Still persists

#0  0x00007f5da2862b2d in pthread_rwlock_wrlock () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f5da1a38359 in apc_lock_wlock (lock=<optimized out>) at /usr/src/apcu/apcu-4.0.10/apc_lock.c:245
#2  0x00007f5da1a3cbe8 in apc_cache_insert (cache=cache@entry=0x2d73120, key=..., value=0x7f5d95412ce8, ctxt=ctxt@entry=0x7ffe99094b80, t=t@entry=1452024027, exclusive=exclusive@entry=0 '\000')
    at /usr/src/apcu/apcu-4.0.10/apc_cache.c:748
#3  0x00007f5da1a3d243 in apc_cache_store (cache=0x2d73120, strkey=0x7f5d899de0f8 "core-init-cache-fantasy", keylen=40, val=0x7f5d899130c8, ttl=59, exclusive=exclusive@entry=0 '\000')
    at /usr/src/apcu/apcu-4.0.10/apc_cache.c:347
#4  0x00007f5da1a39fab in apc_store_helper (ht=<optimized out>, return_value=0x7f5d899e4288, exclusive=<optimized out>, return_value_used=<error reading variable: Unhandled dwarf expression opcode 0xfa>, 
    this_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>, return_value_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>) at /usr/src/apcu/apcu-4.0.10/php_apc.c:662
#5  0x00000000006e86b9 in dtrace_execute_internal ()
#6  0x000000000079dd01 in ?? ()
#7  0x0000000000762e58 in execute_ex ()
#8  0x00000000006e858d in dtrace_execute_ex ()
#9  0x000000000079e338 in ?? ()
#10 0x0000000000762e58 in execute_ex ()
#11 0x00000000006e858d in dtrace_execute_ex ()
#12 0x000000000079e338 in ?? ()
#13 0x0000000000762e58 in execute_ex ()
#14 0x00000000006e858d in dtrace_execute_ex ()
#15 0x000000000079e338 in ?? ()
#16 0x0000000000762e58 in execute_ex ()
#17 0x00000000006e858d in dtrace_execute_ex ()
#18 0x000000000079e338 in ?? ()
#19 0x0000000000762e58 in execute_ex ()
#20 0x00000000006e858d in dtrace_execute_ex ()
#21 0x000000000079fb02 in ?? ()
#22 0x0000000000762e58 in execute_ex ()
#23 0x00000000006e858d in dtrace_execute_ex ()
#24 0x00000000006faee8 in zend_execute_scripts ()
#25 0x00000000006963d2 in php_execute_script ()
#26 0x0000000000473b32 in main ()
(gdb) frame 0
#0  0x00007f143d000b2d in pthread_rwlock_wrlock () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) frame 1
#1  0x00007f143c1d6359 in apc_lock_wlock (lock=<optimized out>) at /usr/src/apcu/apcu-4.0.10/apc_lock.c:245
245 /usr/src/apcu/apcu-4.0.10/apc_lock.c: No such file or directory.
(gdb) frame 2
#2  0x00007f143c1dabe8 in apc_cache_insert (cache=cache@entry=0x29d0120, key=..., value=0x7f142c94f1f8, ctxt=ctxt@entry=0x7ffd4e182b20, t=t@entry=1452028220, exclusive=exclusive@entry=0 '\000')
    at /usr/src/apcu/apcu-4.0.10/apc_cache.c:748
748 /usr/src/apcu/apcu-4.0.10/apc_cache.c: No such file or directory.
(gdb) frame 3
#3  0x00007f143c1db243 in apc_cache_store (cache=0x29d0120, strkey=0x7f142417a418 "hobby-blesk-cz-magsets-props-103", keylen=33, val=0x7f1424192430, ttl=237, exclusive=exclusive@entry=0 '\000')
    at /usr/src/apcu/apcu-4.0.10/apc_cache.c:347
347 in /usr/src/apcu/apcu-4.0.10/apc_cache.c
(gdb) frame 4
#4  0x00007f143c1d7fab in apc_store_helper (ht=<optimized out>, return_value=0x7f1424180d80, exclusive=<optimized out>, return_value_used=<error reading variable: Unhandled dwarf expression opcode 0xfa>, 
    this_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>, return_value_ptr=<error reading variable: Unhandled dwarf expression opcode 0xfa>) at /usr/src/apcu/apcu-4.0.10/php_apc.c:662
662 /usr/src/apcu/apcu-4.0.10/php_apc.c: No such file or directory.


mattsmith0308 avatar mattsmith0308 commented on August 16, 2024

Has anyone looked into this issue recently? I was able to reproduce this in the same manner @jr997 mentioned using 4.0.10. It looks like a fix would be as simple as allocating all necessary memory for the cache_info function prior to taking this lock https://github.com/krakjoe/apcu/blob/PHP5/apc_cache.c#L1539.
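
The suggestion above, doing every allocation before the lock is taken, can be sketched as follows. `toy_cache` and `toy_cache_snapshot` are hypothetical stand-ins, not APCu structures. The sketch also assumes an upper bound on the result size is known in advance, which may not hold for the real cache_info function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature "cache": a lock plus some bytes to report on. */
typedef struct {
    pthread_rwlock_t lock;
    char data[64];
    size_t used;
} toy_cache;

/* Copies the cache contents into a freshly allocated buffer.
 * The allocation, which is the step that can fail, happens before the lock
 * is taken, so a failure can never leave the lock held.
 * Returns NULL on allocation failure; the caller frees the result. */
char *toy_cache_snapshot(toy_cache *cache, size_t *len)
{
    char *out = malloc(sizeof cache->data);   /* worst-case size, no lock yet */
    if (out == NULL) {
        return NULL;
    }

    pthread_rwlock_rdlock(&cache->lock);
    memcpy(out, cache->data, cache->used);    /* no allocation under the lock */
    *len = cache->used;
    pthread_rwlock_unlock(&cache->lock);

    return out;
}
```

If the required size can only be learned by reading the cache, a real implementation would need something more, for example a size estimate taken under a short lock followed by a retry. The sketch leaves that out.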


muxx avatar muxx commented on August 16, 2024

The same problem occurs on PHP 7.0.5 with APCu 5.1.2.


Zlender avatar Zlender commented on August 16, 2024

@muxx do you have a script that shows how you reproduce it with PHP 7? I was trying to do it the way it locks with PHP 5.6 but was not able to.


muxx avatar muxx commented on August 16, 2024

@Zlender Unfortunately I don't have a script that can reproduce this case :( The problem appears under a workload of ~20-30 rps and can trigger after one hour, three hours, or five hours of running. At 20-30 rps it is hard to tell which particular request causes the problem.


Zlender avatar Zlender commented on August 16, 2024

Instructions on how to reproduce this with PHP 5.6: https://github.com/Zlender/apcu_19. A similar approach also works on the latest PHP 7 and APCu 5.


sarumpaet avatar sarumpaet commented on August 16, 2024

@krakjoe I think that fix is too narrow? Similar code is in apc_cache.c/apc_cache_stat (APC_RLOCK, then array_init()) and several places in apc_iterator.c. Shouldn't basically all parts after any APC_RLOCK (and possibly APC_LOCK, too) be guarded by zend_try?
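
The guard being discussed can be sketched with plain setjmp/longjmp, which is what zend_try is built on. This is an illustration under assumptions, not the actual patch: `bailout_target`, `maybe_bail` and `guarded_read` are hypothetical names, and the sketch simply swallows the bailout after unlocking, which may not be what the real code should do.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <setjmp.h>

static pthread_rwlock_t cache_lock = PTHREAD_RWLOCK_INITIALIZER;
static jmp_buf *bailout_target = NULL;   /* stand-in for the engine's bailout slot */

/* Hypothetical stand-in for code that may hit the memory limit. */
static void maybe_bail(int fail)
{
    if (fail && bailout_target != NULL) {
        longjmp(*bailout_target, 1);
    }
}

/* Reads under the lock with a guard around the risky part.
 * Returns 1 on success, 0 if the guarded code bailed out.
 * Either way the lock is released before returning. */
int guarded_read(int fail)
{
    jmp_buf guard;
    volatile int ok = 0;              /* defined result even on bailout */

    pthread_rwlock_rdlock(&cache_lock);
    bailout_target = &guard;
    if (setjmp(guard) == 0) {         /* plays the role of "try" */
        maybe_bail(fail);
        ok = 1;
    }                                 /* a bailout resumes here */
    bailout_target = NULL;
    pthread_rwlock_unlock(&cache_lock);

    return ok;
}
```

Two points matter in this shape: the unlock sits after the guarded region so it runs on both paths, and the result is initialised before the guard so the caller never sees an undefined value when the guarded code bails out.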


krakjoe avatar krakjoe commented on August 16, 2024

@sarumpaet I think you are right, after a quick review I've added some more tries ...

I'm looking for feedback now on stability ... hopefully if nothing bad happens @remicollet can do a release in the coming days ...


remicollet avatar remicollet commented on August 16, 2024

"hopefully if nothing bad happens @remicollet can do a release in the coming days ..."

No problem, just ping me when ready.


sarumpaet avatar sarumpaet commented on August 16, 2024

I don't know how the zend_try mechanism works exactly - is it possible that some of the functions return ill defined values now in the out of memory case? E.g., zend_try { ... array_init(stat); ... } zend_end_try(); return stat; looks suspicious, possibly same for some boolean returns.


krakjoe avatar krakjoe commented on August 16, 2024

Fixed that on last commit, forgot to tag issue ...


krakjoe avatar krakjoe commented on August 16, 2024

Crap ...

