One approach to indirect call optimization
by Mat Martineau
I noticed this patch on netdev, which has been accepted, that avoids an
indirect call to md5_lookup. It mitigates the cost of an existing indirect
call rather than adding a new one, but it shows how the maintainers are
looking at the problem.
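As a rough userspace illustration of the pattern (this is not the kernel
code, and every struct and function name below is made up for the
example): a cheap NULL check on a per-connection pointer gates the
function-pointer call, so connections without keys never pay for the
indirect call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not kernel API. */
struct md5_key {
	int id;
};

struct conn;

struct af_ops {
	/* Indirect call: expensive when retpolines are enabled. */
	struct md5_key *(*md5_lookup)(const struct conn *c);
};

struct conn {
	const struct af_ops *af_specific;
	struct md5_key *md5_keys;	/* NULL when no key is configured */
};

/* Counts how often the indirect lookup is actually reached. */
static int lookup_calls;

static struct md5_key *demo_md5_lookup(const struct conn *c)
{
	lookup_calls++;
	return c->md5_keys;
}

static const struct af_ops demo_ops = {
	.md5_lookup = demo_md5_lookup,
};

/* Take the indirect call only when the connection has keys at all. */
static struct md5_key *get_md5_key(const struct conn *c)
{
	assert(c != NULL && c->af_specific != NULL);

	if (c->md5_keys == NULL)
		return NULL;	/* common case: no function-pointer call */

	return c->af_specific->md5_lookup(c);
}
```

Calling get_md5_key() on a connection whose md5_keys is NULL returns NULL
and leaves lookup_calls untouched; only connections with keys reach
demo_md5_lookup().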
--
Mat Martineau
Intel OTC
---------- Forwarded message ----------
Date: Mon, 23 Apr 2018 14:46:25
From: Eric Dumazet <edumazet(a)google.com>
To: David S . Miller <davem(a)davemloft.net>
Cc: netdev <netdev(a)vger.kernel.org>, Eric Dumazet <edumazet(a)google.com>,
Eric Dumazet <eric.dumazet(a)gmail.com>
Subject: [PATCH net-next] tcp: md5: only call tp->af_specific->md5_lookup() for
md5 sockets
RETPOLINE made calls to tp->af_specific->md5_lookup() quite expensive,
given they have no result.
We can omit the calls for sockets that have no md5 keys.
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
---
net/ipv4/tcp_output.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 383cac0ff0ec059ca7dbc1a6304cc7f8183e008d..95feffb6d53f8a9eadfb15a2fffeec498d6e993a 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -585,14 +585,15 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 	unsigned int remaining = MAX_TCP_OPTION_SPACE;
 	struct tcp_fastopen_request *fastopen = tp->fastopen_req;
+	*md5 = NULL;
 #ifdef CONFIG_TCP_MD5SIG
-	*md5 = tp->af_specific->md5_lookup(sk, sk);
-	if (*md5) {
-		opts->options |= OPTION_MD5;
-		remaining -= TCPOLEN_MD5SIG_ALIGNED;
+	if (unlikely(rcu_access_pointer(tp->md5sig_info))) {
+		*md5 = tp->af_specific->md5_lookup(sk, sk);
+		if (*md5) {
+			opts->options |= OPTION_MD5;
+			remaining -= TCPOLEN_MD5SIG_ALIGNED;
+		}
 	}
-#else
-	*md5 = NULL;
 #endif
 	/* We always get an MSS option. The option bytes which will be seen in
@@ -720,14 +721,15 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 	opts->options = 0;
+	*md5 = NULL;
 #ifdef CONFIG_TCP_MD5SIG
-	*md5 = tp->af_specific->md5_lookup(sk, sk);
-	if (unlikely(*md5)) {
-		opts->options |= OPTION_MD5;
-		size += TCPOLEN_MD5SIG_ALIGNED;
+	if (unlikely(rcu_access_pointer(tp->md5sig_info))) {
+		*md5 = tp->af_specific->md5_lookup(sk, sk);
+		if (*md5) {
+			opts->options |= OPTION_MD5;
+			size += TCPOLEN_MD5SIG_ALIGNED;
+		}
 	}
-#else
-	*md5 = NULL;
 #endif
 	if (likely(tp->rx_opt.tstamp_ok)) {
--
2.17.0.484.g0c8726318c-goog
[Weekly meetings] no meeting this week
by Matthieu Baerts
Hello,
To avoid having a meeting in the middle of all those pumpkins, we have
decided not to hold a meeting this week. The next one will be on the 8th
of November, but at 17:00 UTC (9am PST, 6pm CET).
The list of topics will be available here:
https://annuel2.framapad.org/p/mptcp_upstreaming_20181108
Feel free to add/modify topics!
Have a good week,
Matt
--
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium
[Weekly meetings] MoM - 25th of October 2018
by Matthieu Baerts
Hello,
We just had our 27th meeting with Mat, Peter and Ossama (Intel OTC),
Christoph (Apple) and myself (Tessares).
Thanks again for this new good meeting!
Here are the minutes of the meeting:
current MPTCP:
- v0.93.2 finally released:
https://github.com/multipath-tcp/mptcp/releases/tag/v0.93.2
- should be announced soon
- scripts available to ease the build:
https://github.com/multipath-tcp/mptcp-scripts/pull/7
Mat:
- continued to work on the v3 of the patch
- experimented with TopGit
- Podcast: Hosts are SDN-oriented, and may be more comfortable
talking about network infrastructure. We should be prepared to talk
about lower-layer considerations (like middleboxes and preserving TCP
options), as well as giving an overview of how MPTCP works.
Matthieu:
- did a demo of TopGit: https://github.com/mackyle/topgit
tg create + tg update + tg delete
we didn't cover `tg remote origin --populate` + `tg push` yet
- did a demo of Gerrit using 'git review':
https://review.gerrithub.io/#/c/multipath-tcp/mptcp_net-next/+/430835
git checkout -b my_branch master
(...) # work here
git add -u <...>
git commit -sm "foo" # Change-Id has been added
git review # new Change created
(...) # rework here according to the review
git commit --amend # Change-Id has not been modified
git review # Change 430835 has been updated
It is possible to see only the changes from v1 to v2, and not just from
v0 (the base) to v2 (the final version):
https://review.gerrithub.io/c/multipath-tcp/mptcp_net-next/+/430835/1..2
Important note: if you want to rebase your change and also make other
modifications, it is easier for the reviewers if you first rebase it and
send the new version (with the WIP feature), then make the modifications
afterwards. Otherwise the rebase and your modifications are "merged" into
one new version, and the diff might show unrelated modifications coming
from the rebase.
Git review tool:
https://docs.openstack.org/infra/git-review/installation.html
- Gerrit repo is available:
https://review.gerrithub.io/admin/projects/multipath-tcp/mptcp_net-next
.gitreview file:
=======
[gerrit]
host=review.gerrithub.io
port=29418
project=multipath-tcp/mptcp_net-next
defaultbranch=master
=======
- You can get access to Gerrit via:
https://review.gerrithub.io/plugins/github-plugin/static/scope.html
→ Choose "Reviewer" to only review work, "DEFAULT" to contribute.
- Anybody can create Changes; I can add people to different groups,
change rules, etc.:
https://github.com/multipath-tcp/mptcp_net-next/commit/aead15fd5e245f2876...
- Don't forget to subscribe to new Changes:
https://review.gerrithub.io/settings/#Notifications
Next meeting:
- We haven't decided yet when to have the next meeting: we might skip
next week's if it is not needed, or it could be on the 1st of
November. Be careful with the time: it would still be at 16:00 UTC, 9am
PDT, but 5pm CET (summer time has ended in Europe, but not yet in the
US!)
- Still open to everyone!
- https://annuel2.framapad.org/p/mptcp_upstreaming_20181101
Feel free to comment on these points and propose new ones for the next
meeting!
Talk to you next week,
Matthieu
--
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium
[Weekly meetings] MoM - 18th of October 2018
by Matthieu Baerts
Hello,
We just had our 26th meeting with Mat, Peter and Ossama (Intel OTC),
Christoph (Apple) and myself (Tessares).
Thanks again for this new good meeting!
Here are the minutes of the meeting:
podcast:
- Invited to participate in
https://www.ipspace.net/Podcast/Software_Gone_Wild/About
- after the 19th of November (but still at 7AM :-D )
- With Mat and Christoph (Matthieu as backup if they don't wake up!
:-D )
Design:
- new page by Mat:
https://github.com/multipath-tcp/mptcp_net-next/wiki/Design#api-design
- sock = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
- IPPROTO_MPTCP: Various combinations of AF_INET, AF_INET6, and
the IPV6_V6ONLY socket option provide a way to control v4/v6 selection
for the initial subflow and whether later subflows may be mixed v4/v6.
- important: we could also use BPF to force IPPROTO_MPTCP (or
AF_MULTIPATH), so that other apps do not have to be modified
tests:
- netdevsim: not for us, for hardware devs, at least for the moment
- packetdrill:
- no news
- Intel team might participate, just some admin stuff to do before
tools: gerrit
- http://gerrithub.io/
- had some issues but should be fixed on the service side
- Matth: I can do a demo next week. Same for TopGit
Mat:
- worked on a new RFC version of the patch-set
- when pushing to kernel.org, automated build checks are run
→ Mat fixed some warnings reported by this service
Peter:
- working on not "exposing" IPPROTO_TCP_SUBFLOWS to userspace
- found a way with ULP, not too many changes!
- will be shared soon
Christoph:
- continuing the cleaning, working on that
Ossama:
- still in the process of releasing the userspace code
Matth:
- progress on testing mptcp_trunk
- progress on creating a new release based on v4.9 kernel
Next meeting:
- We propose to have it on Thursday, the 25th of October. Usual
time: 16:00 UTC (9am PDT, 6pm CEST)
- Still open to everyone!
- https://annuel2.framapad.org/p/mptcp_upstreaming_20181025
Feel free to comment on these points and propose new ones for the next
meeting!
Talk to you next week,
Matthieu
--
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium
[PATCH] mptcp: Use tcp_abort correctly for MPTCP
by Matthieu Baerts
With MPTCP, we have a modified version of tcp_send_active_reset(). Let's
use it if needed.
Fixes: 10c57b6f07ac (Merge tag 'v4.5' into mptcp_trunk)
Fixes: 07abdbda3bd6 (mptcp: Use tcp_abort correctly for MPTCP)
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
Signed-off-by: Christoph Paasch <cpaasch(a)apple.com>
(cherry picked from commit f2632fa4ee58f2d375e38119633b5739b6d43b2e)
Signed-off-by: Christoph Paasch <cpaasch(a)apple.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts(a)tessares.net>
---
net/ipv4/tcp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3fb5481e78b3..c4d14fbce16b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4026,7 +4026,7 @@ int tcp_abort(struct sock *sk, int err)
 		smp_wmb();
 		sk->sk_error_report(sk);
 		if (tcp_need_reset(sk->sk_state))
-			tcp_send_active_reset(sk, GFP_ATOMIC);
+			tcp_sk(sk)->ops->send_active_reset(sk, GFP_ATOMIC);
 		tcp_done(sk);
 	}
--
2.17.1
[Weekly meetings] MoM - 11th of October 2018
by Matthieu Baerts
Hello,
We just had our 25th (\o/) meeting with Mat, Peter and Ossama (Intel
OTC), Christoph (Apple) and myself (Tessares).
Thanks again for this new good meeting!
Here are the minutes of the meeting:
Review of Mat & Peter's patches:
- IPPROTO_MPTCP vs AF_MULTIPATH:
- We should document the advantages/disadvantages of each (*Mat*
will try to have a look at that)
- Not an issue to switch from one to another if needed later.
- get rid of SKB's priv_copy()
- we can remove it; it depends on who will need it
- difficult to know what others might need. Maybe better to start
simple, without too much flexibility.
- we will then remove priv_copy(), but we need to keep the
destructor()
- receiving part: DSS mapping: error_queue → dedicated queue? or
only for the signalling?
- Christoph: maybe difficult to manage that in a queue.
- maybe using tcp_read_sock()
- we will always need an ofo_queue
- DSS ACK not driven by the app in theory (like TCP) but we
could ignore that for the moment?
- we can manage optimisation later
- what is important is something easy to maintain and understand
for the moment, keep it simple by using the right available structures.
- we can read data from one subflow as long as it is in order
and then switch to the other subflows.
- we need to be careful with the windows: the new window is
checked at every ACK but we need to know when to increase it. First
solution could be to stall the window, close it, then increase it.
- MPTCP: windows are shared between subflows (SF); sharing the memory
equally between subflows is not the best when a SF contains crap.
- decision: take the error_queue out
- receiving part: bypass coalesce/collapse for MPTCP OR do
coalescing that takes DSS-mappings into account → priv_used is the
perfect infrastructure for that.
- linked to the previous point.
- smart coalesce can be useful for kTLS
- trigger the closing in the right sequence → First, we need to
close the MPTCP-layer and then the subflows.
- optimisation can be done later to close both at the same time
- this matters for the last data that still needs to be sent: if you
close the subflows first, some data may still be pending.
- Christoph will try to find an article to give more details
about that →
https://tools.ietf.org/html/draft-ietf-mptcp-rfc6824bis-12#section-3.3.3
"Essentially, a host MUST NOT close all functioning subflows
unless it is safe to do so, i.e., until all outstanding data has been
DATA_ACKed, or until the segment with the DATA_FIN flag set is the only
outstanding segment."
- prepend MPTCP_ / mptcp_ everywhere → subflow, token, crypto, etc.?
- everything visible should be prepended.
- subflow socket type:
- we could keep a TCP proto but modify the pointers where
needed. We should check how it is done with kTLS.
- we don't want to have the userspace creating a subflow
directly → by using IPPROTO_SUBFLOW
- So overriding pointers would make sense. It is a special proto
for MPTCP, kernel side only.
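
The closing rule quoted above from draft-ietf-mptcp-rfc6824bis-12 can be
sketched as a small predicate. This is only an illustration under stated
assumptions: the struct and its field names are hypothetical and are not
taken from the patches, and it relies on the DATA_FIN occupying one octet
of the connection-level sequence space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection-level send state; names are illustrative only. */
struct mptcp_snd_state {
	uint64_t snd_una;	/* oldest data sequence number not yet DATA_ACKed */
	uint64_t snd_nxt;	/* next data sequence number to be sent */
	bool data_fin_sent;	/* DATA_FIN sent; it is the last octet before snd_nxt */
};

/*
 * True when closing every subflow cannot lose data: either nothing is
 * outstanding at the MPTCP level, or the only outstanding octet is the
 * DATA_FIN itself.
 */
static bool safe_to_close_all_subflows(const struct mptcp_snd_state *s)
{
	uint64_t outstanding;

	assert(s != NULL);

	/* Unsigned subtraction stays correct across sequence wrap-around. */
	outstanding = s->snd_nxt - s->snd_una;

	if (outstanding == 0)
		return true;

	return s->data_fin_sent && outstanding == 1;
}
```

With this check, closing the MPTCP layer first (sending the DATA_FIN)
and only then the subflows follows naturally: the subflows stay open
while any data other than the DATA_FIN is still unacknowledged.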
testing MPTCP:
- packetdrill: no more news from Neil (see the discussions we had
last week)
- netdevsim?
- Seems oriented for drivers devs:
https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.16-Networking
- we need to check if it can be useful for us → Matt will do that
tools: start using TopGit?
- maybe too soon for our repo
- would be nice to already do some experiments with that:
https://github.com/mackyle/topgit
- once Mat and Peter's patches are accepted maybe
- we can do that later
tools: use Gerrit for the reviews?
- http://gerrithub.io/
- can be helpful: easier to keep track of things than with a mailing
list, because there might be a lot of discussions around patches for
this project.
- Matth will have a look at this
- (still possible to keep both, we can quickly experiment)
Next meeting:
- We propose to have it on Thursday, the 18th of October. Usual
time: 16:00 UTC (9am PDT, 6pm CEST)
- Still open to everyone!
- https://annuel2.framapad.org/p/mptcp_upstreaming_20181018
Feel free to comment on these points and propose new ones for the next
meeting!
Talk to you next week,
Matthieu
--
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium
[RFC PATCH v3 00/16] MPTCP architecture proposal for discussion
by Mat Martineau
Hello everyone,
Peter and I have been working on this patch set to show how MPTCP
can fit into the Linux networking stack using these design ideas:
* Applications opt-in to MPTCP using IPPROTO_MPTCP, regular TCP sockets
are still the default. A socket created with
socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP) will attempt to form an
MPTCP connection. IPPROTO_MPTCP == 99 as a placeholder.
* Subflows exist within the kernel as separate sockets, owned by an
MPTCP connection-level socket that is visible to userspace.
* Adds private pointers to struct sk_buff to store MPTCP metadata.
* Adds the CONFIG_MPTCP option to Kconfig.
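
As a rough sketch of what the opt-in could look like from an
application, assuming the placeholder protocol number 99 from this patch
set: the helper name, the macro name and the fallback policy below are
illustrative only and are not part of the patches. On a kernel without
these patches the first socket() call is expected to fail, and the
sketch then falls back to regular TCP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Placeholder value used by this patch set; not an assigned number. */
#define MPTCP_PROTO_PLACEHOLDER 99

/*
 * Hypothetical helper: try to open an MPTCP socket and fall back to a
 * regular TCP socket if the kernel rejects the protocol.
 * Returns the file descriptor, or -1 if both attempts fail.
 * *is_mptcp reports which kind of socket was obtained.
 */
static int open_stream_socket(int family, bool *is_mptcp)
{
	int fd;

	assert(is_mptcp != NULL);

	fd = socket(family, SOCK_STREAM, MPTCP_PROTO_PLACEHOLDER);
	if (fd >= 0) {
		*is_mptcp = true;
		return fd;
	}

	/* Fallback on any failure; a real application may want to be stricter. */
	*is_mptcp = false;
	return socket(family, SOCK_STREAM, IPPROTO_TCP);
}
```

An application would call open_stream_socket(AF_INET, &is_mptcp) and
then use connect(), send() and recv() as usual; regular TCP sockets
remain the default for everything that does not opt in.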
The following patches can form an MPTCP connection with the
multipath-tcp.org kernel (tested with v0.94), and send DSS mappings that
are accepted for the initial data packet. It is an early implementation,
and I don't represent it as being upstreamable as-is or being everyone's
idea of what an eventual upstream implementation will necessarily look
like. It has significant limitations:
* Only one subflow is supported, no joins, and only IPv4.
* Does not support DSS checksums. Checksums must be disabled on the
remote stack (for multipath-tcp.org, 'sudo sysctl -w
net.mptcp.mptcp_checksum=0')
* Lots of debug statements (although they use dynamic debug and are
disabled by default) and TODOs.
* It's only been tested sending small amounts of data for each send.
Hopefully there are some interesting concepts to discuss, and this
code helps us assess how workable the above design principles
are. Thanks in advance for your feedback on the benefits or drawbacks of
this code, how it might be improved, or how other approaches might
compare.
The patch set applies to net-next (as of commit 2ad0d5269970). I have also
pushed it to:
https://git.kernel.org/pub/scm/linux/kernel/git/martineau/linux.git
(mptcp-proposal branch)
v3 changes: Change skb extension technique, change rx path to use error
queue, add foundational code for multiple subflows, and many bug fixes.
v2 changes: Added receive path implementation (last two patches).
Reworked TCP option writing. Miscellaneous bug fixes including
header dependency cleanup.
Mat Martineau (6):
tcp: Add MPTCP option number
tcp: Define IPPROTO_MPTCP
skbuff: Add private data pointer
tcp: Export low-level TCP functions
mptcp: Write MPTCP DSS headers to outgoing data packets
mptcp: Implement MPTCP receive path
Peter Krystad (10):
mptcp: Add MPTCP socket stubs
mptcp: Handle MPTCP TCP options
tcp: Add IPPROTO_SUBFLOW
tcp: expose tcp routines and structs for MPTCP
mptcp: Create SUBFLOW socket for outgoing connections
mptcp: Create SUBFLOW socket for incoming connections
mptcp: Add key generation and token tree
mptcp: Add shutdown() socket operation
mptcp: Add setsockopt()/getsockopt() socket operations
mptcp: Make connection_list a real list of subflows
include/linux/skbuff.h | 16 +-
include/linux/tcp.h | 26 +
include/net/inet_common.h | 3 +
include/net/mptcp.h | 205 ++++++++
include/net/tcp.h | 8 +
include/uapi/linux/errqueue.h | 1 +
include/uapi/linux/in.h | 4 +
net/Kconfig | 1 +
net/Makefile | 1 +
net/core/skbuff.c | 5 +
net/ipv4/af_inet.c | 2 +-
net/ipv4/tcp.c | 12 +-
net/ipv4/tcp_input.c | 18 +
net/ipv4/tcp_ipv4.c | 4 +-
net/ipv4/tcp_output.c | 239 ++++++++-
net/mptcp/Kconfig | 10 +
net/mptcp/Makefile | 3 +
net/mptcp/crypto.c | 215 ++++++++
net/mptcp/options.c | 263 ++++++++++
net/mptcp/protocol.c | 891 ++++++++++++++++++++++++++++++++++
net/mptcp/subflow.c | 377 ++++++++++++++
net/mptcp/token.c | 265 ++++++++++
22 files changed, 2544 insertions(+), 25 deletions(-)
create mode 100644 include/net/mptcp.h
create mode 100644 net/mptcp/Kconfig
create mode 100644 net/mptcp/Makefile
create mode 100644 net/mptcp/crypto.c
create mode 100644 net/mptcp/options.c
create mode 100644 net/mptcp/protocol.c
create mode 100644 net/mptcp/subflow.c
create mode 100644 net/mptcp/token.c
--
2.19.1