summaryrefslogtreecommitdiff
path: root/samples
Commit message (Collapse)AuthorAge
...
| * | | samples/bpf: xdp_monitor include cpumap tracepoints in monitoringJesper Dangaard Brouer2018-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xdp_redirect_cpu sample have some "builtin" monitoring of the tracepoints for xdp_cpumap_*, but it is practical to have an external tool that can monitor these transpoint as an easy way to troubleshoot an application using XDP + cpumap. Specifically I need such external tool when working on Suricata and XDP cpumap redirect. Extend the xdp_monitor tool sample with monitoring of these xdp_cpumap_* tracepoints. Model the output format like xdp_redirect_cpu. Given I needed to handle per CPU decoding for cpumap, this patch also add per CPU info on the existing monitor events. This resembles part of the builtin monitoring output from sample xdp_rxq_info. Thus, also covering part of that sample in an external monitoring tool. Performance wise, the cpumap tracepoints uses bulking, which cause them to have very little overhead. Thus, they are enabled by default. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | samples/bpf: xdp2skb_meta comment explain why pkt-data pointers are invalidatedJesper Dangaard Brouer2018-01-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve the 'unknown reason' comment, with an actual explaination of why the ctx pkt-data pointers need to be loaded after the helper function bpf_xdp_adjust_meta(). Based on the explaination Daniel gave. Fixes: 36e04a2d78d9 ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | samples/bpf: Fix trailing semicolonLuis de Bethencourt2018-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The trailing semicolon is an empty statement that does no operation. Removing it since it doesn't do anything. Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | samples/bpf: xdp2skb_meta shows transferring info from XDP to SKBJesper Dangaard Brouer2018-01-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Creating a bpf sample that shows howto use the XDP 'data_meta' infrastructure, created by Daniel Borkmann. Very few drivers support this feature, but I wanted a functional sample to begin with, when working on adding driver support. XDP data_meta is about creating a communication channel between BPF programs. This can be XDP tail-progs, but also other SKB based BPF hooks, like in this case the TC clsact hook. In this sample I show that XDP can store info named "mark", and TC/clsact chooses to use this info and store it into the skb->mark. It is a bit annoying that XDP and TC samples uses different tools/libs when attaching their BPF hooks. As the XDP and TC programs need to cooperate and agree on a struct-layout, it is best/easiest if the two programs can be contained within the same BPF restricted-C file. As the bpf-loader, I choose to not use bpf_load.c (or libbpf), but instead wrote a bash shell scripted named xdp2skb_meta.sh, which demonstrate howto use the iproute cmdline tools 'tc' and 'ip' for loading BPF programs. To make it easy for first time users, the shell script have command line parsing, and support --verbose and --dry-run mode, if you just want to see/learn the tc+ip command syntax: # ./xdp2skb_meta.sh --dev ixgbe2 --dry-run # Dry-run mode: enable VERBOSE and don't call TC+IP tc qdisc del dev ixgbe2 clsact tc qdisc add dev ixgbe2 clsact tc filter add dev ixgbe2 ingress prio 1 handle 1 bpf da obj ./xdp2skb_meta_kern.o sec tc_mark # Flush XDP on device: ixgbe2 ip link set dev ixgbe2 xdp off ip link set dev ixgbe2 xdp obj ./xdp2skb_meta_kern.o sec xdp_mark Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | samples/bpf: program demonstrating access to xdp_rxq_infoJesper Dangaard Brouer2018-01-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This sample program can be used for monitoring and reporting how many packets per sec (pps) are received per NIC RX queue index and which CPU processed the packet. In itself it is a useful tool for quickly identifying RSS imbalance issues, see below. The default XDP action is XDP_PASS in-order to provide a monitor mode. For benchmarking purposes it is possible to specify other XDP actions on the cmdline --action. Output below shows an imbalance RSS case where most RXQ's deliver to CPU-0 while CPU-2 only get packets from a single RXQ. Looking at things from a CPU level the two CPUs are processing approx the same amount, BUT looking at the rx_queue_index levels it is clear that RXQ-2 receive much better service, than other RXQs which all share CPU-0. Running XDP on dev:i40e1 (ifindex:3) action:XDP_PASS XDP stats CPU pps issue-pps XDP-RX CPU 0 900,473 0 XDP-RX CPU 2 906,921 0 XDP-RX CPU total 1,807,395 RXQ stats RXQ:CPU pps issue-pps rx_queue_index 0:0 180,098 0 rx_queue_index 0:sum 180,098 rx_queue_index 1:0 180,098 0 rx_queue_index 1:sum 180,098 rx_queue_index 2:2 906,921 0 rx_queue_index 2:sum 906,921 rx_queue_index 3:0 180,098 0 rx_queue_index 3:sum 180,098 rx_queue_index 4:0 180,082 0 rx_queue_index 4:sum 180,082 rx_queue_index 5:0 180,093 0 rx_queue_index 5:sum 180,093 Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2017-12-18
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-18 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Allow arbitrary function calls from one BPF function to another BPF function. As of today when writing BPF programs, __always_inline had to be used in the BPF C programs for all functions, unnecessarily causing LLVM to inflate code size. Handle this more naturally with support for BPF to BPF calls such that this __always_inline restriction can be overcome. As a result, it allows for better optimized code and finally enables to introduce core BPF libraries in the future that can be reused out of different projects. x86 and arm64 JIT support was added as well, from Alexei. 2) Add infrastructure for tagging functions as error injectable and allow for BPF to return arbitrary error values when BPF is attached via kprobes on those. This way of injecting errors generically eases testing and debugging without having to recompile or restart the kernel. Tags for opting-in for this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef. 3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper call for XDP programs. First part of this work adds handling of BPF capabilities included in the firmware, and the later patches add support to the nfp verifier part and JIT as well as some small optimizations, from Jakub. 4) The bpftool now also gets support for basic cgroup BPF operations such as attaching, detaching and listing current BPF programs. As a requirement for the attach part, bpftool can now also load object files through 'bpftool prog load'. This reuses libbpf which we have in the kernel tree as well. bpftool-cgroup man page is added along with it, from Roman. 5) Back then commit e87c6bc3852b ("bpf: permit multiple bpf attachments for a single perf event") added support for attaching multiple BPF programs to a single perf event. Given they are configured through perf's ioctl() interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF command in this work in order to return an array of one or multiple BPF prog ids that are currently attached, from Yonghong. 6) Various minor fixes and cleanups to the bpftool's Makefile as well as a new 'uninstall' and 'doc-uninstall' target for removing bpftool itself or prior installed documentation related to it, from Quentin. 7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is required for the test_dev_cgroup test case to run, from Naresh. 8) Fix reporting of XDP prog_flags for nfp driver, from Jakub. 9) Fix libbpf's exit code from the Makefile when libelf was not found in the system, also from Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | samples/bpf: add a test for bpf_override_returnJosef Bacik2017-12-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a basic test for bpf_override_return to verify it works. We override the main function for mounting a btrfs fs so it'll return -ENOMEM and then make sure that trying to mount a btrfs fs will fail. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| * | | | samples/bpf: add erspan v2 sample codeWilliam Tu2017-12-15
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | Extend the existing tests for ipv4 ipv6 erspan version 2. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | samples/bpf: add ip6erspan sample codeWilliam Tu2017-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the existing tests for ip6erspan. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-12-05
| |\ \ \ | | | |/ | | |/| | | | | | | | | | | | | | | | | Small overlapping change conflict ('net' changed a line, 'net-next' added a line right afterwards) in flexcan.c Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2017-12-04
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-03 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Addition of a software model for BPF offloads in order to ease testing code changes in that area and make semantics more clear. This is implemented in a new driver called netdevsim, which can later also be extended for other offloads. SR-IOV support is added as well to netdevsim. BPF kernel selftests for offloading are added so we can track basic functionality as well as exercising all corner cases around BPF offloading, from Jakub. 2) Today drivers have to drop the reference on BPF progs they hold due to XDP on device teardown themselves. Change this in order to make XDP handling inside the drivers less error prone, and move disabling XDP to the core instead, also from Jakub. 3) Misc set of BPF verifier improvements and cleanups as preparatory work for upcoming BPF-to-BPF calls. Among others, this set also improves liveness marking such that pruning can be slightly more effective. Register and stack liveness information is now included in the verifier log as well, from Alexei. 4) nfp JIT improvements in order to identify load/store sequences in the BPF prog e.g. coming from memcpy lowering and optimizing them through the NPU's command push pull (CPP) instruction, from Jiong. 5) Cleanups to test_cgrp2_attach2.c BPF sample code in oder to remove bpf_prog_attach() magic values and replacing them with actual proper attach flag instead, from David. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | samples/bpf: Convert magic numbers to names in multi-prog cgroup test caseDavid Ahern2017-11-30
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Attach flag 1 == BPF_F_ALLOW_OVERRIDE; attach flag 2 == BPF_F_ALLOW_MULTI. Update the calls to bpf_prog_attach() in test_cgrp2_attach2.c to use the names over the magic numbers. Fixes: 39323e788cb67 ("samples/bpf: add multi-prog cgroup test case") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| * | | samples/bpf: extend test_tunnel_bpf.sh with ip6greWilliam Tu2017-12-04
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | Extend existing tests for vxlan, gre, geneve, ipip, erspan, to include ip6 gre and gretap tunnel. Signed-off-by: William Tu <u9012063@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2018-01-31
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching Pull livepatching updates from Jiri Kosina: - handle 'infinitely'-long sleeping tasks, from Miroslav Benes - remove 'immediate' feature, as it turns out it doesn't provide the originally expected semantics, and brings more issues than value * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: livepatch: add locking to force and signal functions livepatch: Remove immediate feature livepatch: force transition to finish livepatch: send a fake signal to all blocking tasks
| * | livepatch: Remove immediate featureMiroslav Benes2018-01-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Immediate flag has been used to disable per-task consistency and patch all tasks immediately. It could be useful if the patch doesn't change any function or data semantics. However, it causes problems on its own. The consistency problem is currently broken with respect to immediate patches. func a patches 1i 2i 3 When the patch 3 is applied, only 2i function is checked (by stack checking facility). There might be a task sleeping in 1i though. Such task is migrated to 3, because we do not check 1i in klp_check_stack_func() at all. Coming atomic replace feature would be easier to implement and more reliable without immediate. Thus, remove immediate feature completely and save us from the problems. Note that force feature has the similar problem. However it is considered as a last resort. If used, administrator should not apply any new live patches and should plan for reboot into an updated kernel. The architectures would now need to provide HAVE_RELIABLE_STACKTRACE to fully support livepatch. Signed-off-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2017-12-03
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf 2017-12-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a compilation warning in xdp redirect tracepoint due to missing bpf.h include that pulls in struct bpf_map, from Xie. 2) Limit the maximum number of attachable BPF progs for a given perf event as long as uabi is not frozen yet. The hard upper limit is now 64 and therefore the same as with BPF multi-prog for cgroups. Also add related error checking for the sample BPF loader when enabling and attaching to the perf event, from Yonghong. 3) Specifically set the RLIMIT_MEMLOCK for the test_verifier_log case, so that the test case can always pass and not fail in some environments due to too low default limit, also from Yonghong. 4) Fix up a missing license header comment for kernel/bpf/offload.c, from Jakub. 5) Several fixes for bpftool, among others a crash on incorrect arguments when json output is used, error message handling fixes on unknown options and proper destruction of json writer for some exit cases, all from Quentin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | samples/bpf: add error checking for perf ioctl calls in bpf loaderYonghong Song2017-12-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | load_bpf_file() should fail if ioctl with command PERF_EVENT_IOC_ENABLE and PERF_EVENT_IOC_SET_BPF fails. When they do fail, proper error messages are printed. With this change, the below "syscall_tp" run shows that the maximum number of bpf progs attaching to the same perf tracepoint is indeed enforced. $ ./syscall_tp -i 64 prog #0: map ids 4 5 ... prog #63: map ids 382 383 $ ./syscall_tp -i 65 prog #0: map ids 4 5 ... prog #64: map ids 388 389 ioctl PERF_EVENT_IOC_SET_BPF failed err Argument list too long Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
* | | kbuild: remove all dummy assignments to obj-Masahiro Yamada2017-11-18
|/ / | | | | | | | | | | | | Now kbuild core scripts create empty built-in.o where necessary. Remove "obj- := dummy.o" tricks. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
* | Merge tag 'media/v4.15-1' of ↵Linus Torvalds2017-11-15
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - Documentation for digital TV (both kAPI and uAPI) are now in sync with the implementation (except for legacy/deprecated ioctls). This is a major step, as there were always a gap there - New sensor driver: imx274 - New cec driver: cec-gpio - New platform driver for rockship rga and tegra CEC - New RC driver: tango-ir - Several cleanups at atomisp driver - Core improvements for RC, CEC, V4L2 async probing support and DVB - Lots of drivers cleanup, fixes and improvements. * tag 'media/v4.15-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (332 commits) dvb_frontend: don't use-after-free the frontend struct media: dib0700: fix invalid dvb_detach argument media: v4l2-ctrls: Don't validate BITMASK twice media: s5p-mfc: fix lockdep warning media: dvb-core: always call invoke_release() in fe_free() media: usb: dvb-usb-v2: dvb_usb_core: remove redundant code in dvb_usb_fe_sleep media: au0828: make const array addr_list static media: cx88: make const arrays default_addr_list and pvr2000_addr_list static media: drxd: make const array fastIncrDecLUT static media: usb: fix spelling mistake: "synchronuously" -> "synchronously" media: ddbridge: fix build warnings media: av7110: avoid 2038 overflow in debug print media: Don't do DMA on stack for firmware upload in the AS102 driver media: v4l: async: fix unregister for implicitly registered sub-device notifiers media: v4l: async: fix return of unitialized variable ret media: imx274: fix missing return assignment from call to imx274_mode_regs media: camss-vfe: always initialize reg at vfe_set_xbar_cfg() media: atomisp: make function calls cleaner media: atomisp: get rid of storage_class.h media: atomisp: get rid of wrong stddef.h include ...
| * \ Merge tag 'staging-4.15-rc1' into v4l_for_linusMauro Carvalho Chehab2017-11-14
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some conflicts between staging and media trees, as reported by Stephen Rothwell <sfr@canb.auug.org.au>. So, merge from staging. * tag 'staging-4.15-rc1': (775 commits) staging: lustre: add SPDX identifiers to all lustre files staging: greybus: Remove redundant license text staging: greybus: add SPDX identifiers to all greybus driver files staging: ccree: simplify ioread/iowrite staging: ccree: simplify registers access staging: ccree: simplify error handling logic staging: ccree: remove dead code staging: ccree: handle limiting of DMA masks staging: ccree: copy IV to DMAable memory staging: fbtft: remove redundant initialization of buf staging: sm750fb: Fix parameter mistake in poke32 staging: wilc1000: Fix bssid buffer offset in Txq staging: fbtft: fb_ssd1331: fix mirrored display staging: android: Fix checkpatch.pl error staging: greybus: loopback: convert loopback to use generic async operations staging: greybus: operation: add private data with get/set accessors staging: greybus: loopback: Fix iteration count on async path staging: greybus: loopback: Hold per-connection mutex across operations staging: greybus/loopback: use ktime_get() for time intervals staging: fsl-dpaa2/eth: Extra headroom in RX buffers ... Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
| * | | [media] media: v4l2-pci-skeleton: Fix error handling path in 'skeleton_probe()'Christophe JAILLET2017-10-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If this memory allocation fails, we must release some resources, as already done in the code below and above. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2017-11-15
|\ \ \ \ | |_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: "Highlights: 1) Maintain the TCP retransmit queue using an rbtree, with 1GB windows at 100Gb this really has become necessary. From Eric Dumazet. 2) Multi-program support for cgroup+bpf, from Alexei Starovoitov. 3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew Lunn. 4) Add meter action support to openvswitch, from Andy Zhou. 5) Add a data meta pointer for BPF accessible packets, from Daniel Borkmann. 6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet. 7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli. 8) More work to move the RTNL mutex down, from Florian Westphal. 9) Add 'bpftool' utility, to help with bpf program introspection. From Jakub Kicinski. 10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper Dangaard Brouer. 11) Support 'blocks' of transformations in the packet scheduler which can span multiple network devices, from Jiri Pirko. 12) TC flower offload support in cxgb4, from Kumar Sanghvi. 13) Priority based stream scheduler for SCTP, from Marcelo Ricardo Leitner. 14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg. 15) Add RED qdisc offloadability, and use it in mlxsw driver. From Nogah Frankel. 16) eBPF based device controller for cgroup v2, from Roman Gushchin. 17) Add some fundamental tracepoints for TCP, from Song Liu. 18) Remove garbage collection from ipv6 route layer, this is a significant accomplishment. From Wei Wang. 19) Add multicast route offload support to mlxsw, from Yotam Gigi" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits) tcp: highest_sack fix geneve: fix fill_info when link down bpf: fix lockdep splat net: cdc_ncm: GetNtbFormat endian fix openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start netem: remove unnecessary 64 bit modulus netem: use 64 bit divide by rate tcp: Namespace-ify sysctl_tcp_default_congestion_control net: Protect iterations over net::fib_notifier_ops in fib_seq_sum() ipv6: set all.accept_dad to 0 by default uapi: fix linux/tls.h userspace compilation error usbnet: ipheth: prevent TX queue timeouts when device not ready vhost_net: conditionally enable tx polling uapi: fix linux/rxrpc.h userspace compilation errors net: stmmac: fix LPI transitioning for dwmac4 atm: horizon: Fix irq release error net-sysfs: trigger netlink notification on ifalias change via sysfs openvswitch: Using kfree_rcu() to simplify the code openvswitch: Make local function ovs_nsh_key_attr_size() static openvswitch: Fix return value check in ovs_meter_cmd_features() ...
| * | | xdp: sample: Missing curly braces in read_route()Dan Carpenter2017-11-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The assert statement is supposed to be part of the else branch but the curly braces were accidentally left off. Fixes: 3e29cd0e6563 ("xdp: Sample xdp program implementing ip forward") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Revert bpf_overrid_function() helper changes.David S. Miller2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | NACK'd by x86 maintainer. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Fix tcp_clamp_kern.c sample programLawrence Brakmo2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: 390ee7e29fc8 ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Fix tcp_iw_kern.c sample programLawrence Brakmo2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: 390ee7e29fc8 ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Fix tcp_cong_kern.c sample programLawrence Brakmo2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: 390ee7e29fc8 ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Fix tcp_bufs_kern.c sample programLawrence Brakmo2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: 390ee7e29fc8 ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Fix tcp_rwnd_kern.c sample programLawrence Brakmo2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: 390ee7e29fc8 ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Fix tcp_synrto_kern.c sample programLawrence Brakmo2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The program was returning -1 in some cases which is not allowed by the verifier any longer. Fixes: 390ee7e29fc8 ("bpf: enforce return code for cgroup-bpf programs") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | samples/bpf: add a test for bpf_override_returnJosef Bacik2017-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a basic test for bpf_override_return to verify it works. We override the main function for mounting a btrfs fs so it'll return -ENOMEM and then make sure that trying to mount a btrfs fs will fail. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: Rename tcp_bbf.readme to tcp_bpf.readmeLawrence Brakmo2017-11-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original patch had the wrong filename. Fixes: bfdf75693875 ("bpf: create samples/bpf/tcp_bpf.readme") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | xdp: Sample xdp program implementing ip forwardChristina Jacob2017-11-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implements port to port forwarding with route table and arp table lookup for ipv4 packets using bpf_redirect helper function and lpm_trie map. Signed-off-by: Christina Jacob <Christina.Jacob@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bpf: move cgroup_helpers from samples/bpf/ to tools/testing/selftesting/bpf/Roman Gushchin2017-11-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The purpose of this move is to use these files in bpf tests. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-11-04
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Files removed in 'net-next' had their license header updated in 'net'. We take the remove from 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | samples/pktgen: remove remaining old pktgen sample scriptsJesper Dangaard Brouer2017-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 0f06a6787e05 ("samples: Add an IPv6 '-6' option to the pktgen scripts") the newer pktgen_sampleXX script does show howto use IPv6 with pktgen. Thus, there is no longer a reason to keep the older sample scripts around. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | samples/pktgen: update sample03, no need for clones when burstingJesper Dangaard Brouer2017-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like sample05, don't use pktgen clone_skb feature when using 'burst' feature, it is not really needed. This brings the burst users in sync. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | samples/pktgen: add script pktgen_sample06_numa_awared_queue_irq_affinity.shRobert Hoo2017-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This script simply does: * Detect $DEV's NUMA node belonging. * Bind each thread (processor of NUMA locality) with each $DEV queue's irq affinity, 1:1 mapping. * How many '-t' threads input determines how many queues will be utilized. If '-f' designates first cpu id, then offset in the NUMA node's cpu list. (Changes by Jesper: allow changing count from cmdline via '-n') Signed-off-by: Robert Hoo <robert.hu@intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | samples/pktgen: Add some helper functionsRobert Hoo2017-11-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. given a device, get its NUMA belongings 2. given a device, get its queues' irq numbers. 3. given a NUMA node, get its cpu id list. Signed-off-by: Robert Hoo <robert.hu@intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-10-30
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several conflicts here. NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to nfp_fl_output() needed some adjustments because the code block is in an else block now. Parallel additions to net/pkt_cls.h and net/sch_generic.h A bug fix in __tcp_retransmit_skb() conflicted with some of the rbtree changes in net-next. The tc action RCU callback fixes in 'net' had some overlap with some of the recent tcf_block reworking. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp_redirect_mapTushar Dave2017-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Default rlimit RLIMIT_MEMLOCK is 64KB, causes bpf map failure. e.g. [root@labbpf]# ./xdp_redirect_map $(</sys/class/net/eth2/ifindex) \ > $(</sys/class/net/eth3/ifindex) failed to create a map: 1 Operation not permitted The failure is 100% when multiple xdp programs are running. Fix it. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | samples/bpf: adjust rlimit RLIMIT_MEMLOCK for xdp1Tushar Dave2017-10-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Default rlimit RLIMIT_MEMLOCK is 64KB, causes bpf map failure. e.g. [root@lab bpf]#./xdp1 -N $(</sys/class/net/eth2/ifindex) failed to create a map: 1 Operation not permitted Fix it. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | bpf: add a test case to test single tp multiple bpf attachmentYonghong Song2017-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bpf sample program syscall_tp is modified to show attachment of more than bpf programs for a particular kernel tracepoint. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2017-10-22
| |\ \ \ \ \ | | | |_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There were quite a few overlapping sets of changes here. Daniel's bug fix for off-by-ones in the new BPF branch instructions, along with the added allowances for "data_end > ptr + x" forms collided with the metadata additions. Along with those three changes came veritifer test cases, which in their final form I tried to group together properly. If I had just trimmed GIT's conflict tags as-is, this would have split up the meta tests unnecessarily. In the socketmap code, a set of preemption disabling changes overlapped with the rename of bpf_compute_data_end() to bpf_compute_data_pointers(). Changes were made to the mv88e6060.c driver set addr method which got removed in net-next. The hyperv transport socket layer had a locking change in 'net' which overlapped with a change of socket state macro usage in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | bpf: create samples/bpf/tcp_bpf.readmeLawrence Brakmo2017-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Readme file explaining how to create a cgroupv2 and attach one of the tcp_*_kern.o socket_ops BPF program. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked_by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | bpf: sample BPF_SOCKET_OPS_BASE_RTT programLawrence Brakmo2017-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sample socket_ops BPF program to test the BPF helper function bpf_getsocketops and the new socket_ops op BPF_SOCKET_OPS_BASE_RTT. The program provides a base RTT of 80us when the calling flow is within a DC (as determined by the IPV6 prefix) and the congestion algorithm is "nv". Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked_by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | samples/bpf: add cpumap sample program xdp_redirect_cpuJesper Dangaard Brouer2017-10-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This sample program show how to use cpumap and the associated tracepoints. It provides command line stats, which shows how the XDP-RX process, cpumap-enqueue and cpumap kthread dequeue is cooperating on a per CPU basis. It also utilize the xdp_exception and xdp_redirect_err transpoints to allow users quickly to identify setup issues. One issue with ixgbe driver is that the driver reset the link when loading XDP. This reset the procfs smp_affinity settings. Thus, after loading the program, these must be reconfigured. The easiest workaround it to reduce the RX-queue to e.g. two via: # ethtool --set-channels ixgbe1 combined 2 And then add CPUs above 0 and 1, like: # xdp_redirect_cpu --dev ixgbe1 --prog 2 --cpu 2 --cpu 3 --cpu 4 Another issue with ixgbe is that the page recycle mechanism is tied to the RX-ring size. And the default setting of 512 elements is too small. This is the same issue with regular devmap XDP_REDIRECT. To overcome this I've been using 1024 rx-ring size: # ethtool -G ixgbe1 rx 1024 tx 1024 V3: - whitespace cleanups - bpf tracepoint cannot access top part of struct V4: - report on kthread sched events, according to tracepoint change - report average bulk enqueue size V5: - bpf_map_lookup_elem on cpumap not allowed from bpf_prog use separate map to mark CPUs not available V6: - correct kthread sched summary output V7: - Added a --stress-mode for concurrently changing underlying cpumap Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | bpf: Add -target to clang switch while cross compiling.Abhijit Ayarekar2017-10-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update to llvm excludes assembly instructions. llvm git revision is below commit 65fad7c26569 ("bpf: add inline-asm support") This change will be part of llvm release 6.0 __ASM_SYSREG_H define is not required for native compile. -target switch includes appropriate target specific files while cross compiling Tested on x86 and arm64. Signed-off-by: Abhijit Ayarekar <abhijit.ayarekar@caviumnetworks.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | bpf: add a test case for helper bpf_perf_prog_read_valueYonghong Song2017-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bpf sample program trace_event is enhanced to use the new helper to print out enabled/running time. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | bpf: add a test case for helper bpf_perf_event_read_valueYonghong Song2017-10-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bpf sample program tracex6 is enhanced to use the new helper to read enabled/running time as well. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>