Comments (4)

tbarbette commented on August 19, 2024

I have more info about this problem (I work with @devmusings).

It works fine if I use the latest ixgbe driver from Intel, available at http://sourceforge.net/projects/e1000/files/ixgbe%20stable/3.22.3/ .
We run the standard Debian 3.16 kernel, and according to dmesg the in-kernel driver is 3.19.1-k, a variant of 3.19.1. However, it is not clear how the in-kernel ixgbe differs from the Intel one.
Kernel 3.18 seems to ship the same driver version, so I did not test it; it would probably show the same bug.

The problem is always in ixgbe_xmit_frame_ring.

To conclude, the problem is ixgbe-related. It is fixed in the latest driver from Intel but is still present in the most recent kernel. A bug report should probably be filed, since the problem could come from one of the modifications made in the kernel version of the driver, but it is hard to describe because I could not find its actual source. For now, if someone hits the same problem, just use Intel's driver.

Also, it appears only with some packet generators. With a loop configuration using Netmap everything works, but when a Tilera generates the packets it fails. The generated packets are the same, produced by nearly the same program.

Here are the last kernel messages, recovered with kdump. The first two lines are the Print() output for the last packet and the output of a click_chatter I added. The last packet seems fine.

[ 1838.399421] chatter: 60 | 90e2ba46 f2e067c6 697351ff 08004510 002e0000 40004011
[ 1838.401131] chatter: Packet 0xffff8803d2821500, length 60, txq 0xffff8800369d0000 (dev = 0xffff8800369c0000, state = 0), netdev 0xffff8800369c0000 (name = eth2, state = 3, id=0, port=0)
[ 1838.404559] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
[ 1838.406268] IP: [] ixgbe_xmit_frame_ring+0x79/0xc70 [ixgbe]
[ 1838.407955] PGD 0
[ 1838.409610] Oops: 0000 [#1] SMP
[ 1838.411264] Modules linked in: click(O) proclikefs(O) ipt_REJECT xt_LOG xt_limit xt_multiport iptable_filter ip_tables x_tables bnep bluetooth 6lowpan_iphc binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc nls_utf8 nls_cp437 vfat fat fuse snd_hda_codec_hdmi joydev x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_hda_codec_realtek crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel aesni_intel eeepc_wmi aes_x86_64 asus_wmi lrw sparse_keymap gf128mul snd_hda_intel rfkill glue_helper ablk_helper cryptd video nvidia(PO) snd_hda_controller iTCO_wdt psmouse iTCO_vendor_support pcspkr serio_raw mxm_wmi snd_hda_codec evdev snd_hwdep sb_edac edac_core snd_pcm lpc_ich snd_timer snd mfd_core i2c_i801 soundcore drm shpchp tpm_infineon tpm_tis processor tpm wmi thermal_sys
[ 1838.419652] mei_me mei button ext4 crc16 mbcache jbd2 dm_mod raid0 hid_generic usbhid hid md_mod sg sr_mod cdrom sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ehci_pci libata xhci_hcd ehci_hcd igb firewire_ohci i2c_algo_bit scsi_mod firewire_core usbcore crc_itu_t i2c_core ixgbe usb_common dca ptp pps_core mdio
[ 1838.424613] CPU: 0 PID: 12072 Comm: kclick Tainted: P O 3.16.0-4-amd64 #1 Debian 3.16.7-ckt2-1
[ 1838.426230] Hardware name: PRIMINFO UNLOCK INSTALL/P9X79-E WS, BIOS 1501 01/15/2014
[ 1838.427832] task: ffff88041d744ca0 ti: ffff8803cd37c000 task.ti: ffff8803cd37c000
[ 1838.429380] RIP: 0010:[] [] ixgbe_xmit_frame_ring+0x79/0xc70 [ixgbe]
[ 1838.430932] RSP: 0018:ffff8803cd37fcf0 EFLAGS: 00010246
[ 1838.432458] RAX: 000000000000003c RBX: 0000000000000000 RCX: 0000000000000001
[ 1838.433967] RDX: 0000000000000000 RSI: ffff8800369c08c0 RDI: ffff8803d2821500
[ 1838.435478] RBP: 0000000000000000 R08: ffff8803cd2aa440 R09: 0000000000000000
[ 1838.436973] R10: ffff8803d1438000 R11: 0000000000005000 R12: ffff8803d2821500
[ 1838.438453] R13: 0000000000000008 R14: ffff8800369c08c0 R15: ffff8803d2821500
[ 1838.439928] FS: 0000000000000000(0000) GS:ffff88042fc00000(0000) knlGS:0000000000000000
[ 1838.441389] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1838.442829] CR2: 0000000000000058 CR3: 0000000001813000 CR4: 00000000001407f0
[ 1838.444270] Stack:
[ 1838.445690] ffff88041a40314c 0000000000000003 ffff88041a403140 ffff880419e3d30c
[ 1838.447099] 00000000000000a5 ffff88041a496a00 0000000000000000 ffff8800369c0000
[ 1838.448512] ffff8803d2821500 ffff8800369d0000 ffff8803d2821500 ffffffffa11ad487
[ 1838.449907] Call Trace:
[ 1838.451291] [] ? _ZN8ToDevice12queue_packetEP6PacketP12netdev_queue+0xc7/0x200 [click]
[ 1838.452672] [] ? _ZN8ToDevice8run_taskEP4Task+0xf8/0x480 [click]
[ 1838.454030] [] ? _ZN12RouterThread6driverEv+0x42d/0x5f0 [click]
[ 1838.455368] [] ? _ZL11click_schedPv+0x158/0x320 [click]
[ 1838.456688] [] ? __schedule+0x2b1/0x710
[ 1838.458004] [] ? _Z19click_cleanup_schedv+0x160/0x160 [click]
[ 1838.459280] [] ? kthread+0xbd/0xe0
[ 1838.460548] [] ? kthread_create_on_node+0x180/0x180
[ 1838.461795] [] ? ret_from_fork+0x7c/0xb0
[ 1838.463013] [] ? kthread_create_on_node+0x180/0x180
[ 1838.464227] Code: 83 e9 01 31 c0 45 0f b7 c9 49 83 c1 01 49 c1 e1 04 90 41 8b 7c 00 3c 48 83 c0 10 8d 97 ff 3f 00 00 c1 ea 0e 01 d1 4c 39 c8 75 e7 <0f> b7 43 58 0f b7 73 5a 83 c1 03 31 d2 66 39 f0 66 0f 43 53 54
[ 1838.466871] RIP [] ixgbe_xmit_frame_ring+0x79/0xc70 [ixgbe]
[ 1838.468126] RSP
[ 1838.469356] CR2: 0000000000000058

pallas commented on August 19, 2024

If this makes you feel any better (or worse) my team had similar experiences with that driver and we always use the Intel version now. Sorry I didn't see this issue earlier since I probably could have saved you some time.

tbarbette commented on August 19, 2024

I spoke too quickly. It may be a separate problem, but it still does not work when I use multiqueue.

This seems to be because, even with single-threaded Click, packet_notifier_hook() in fromdevice.cc can be called concurrently, as there are multiple interrupts coming from the card on multiple CPUs. Adding a big lock fixes the problem. I will double-check that, think about a better solution (I'd say an atomic increment on the queue head), and come back with a patch.

tbarbette avatar tbarbette commented on August 19, 2024

This was solved by #182
