{"id":133033,"date":"2024-08-13T17:38:10","date_gmt":"2024-08-13T17:38:10","guid":{"rendered":"https:\/\/showbizztoday.com\/index.php\/2024\/08\/13\/investigation-of-a-cross-regional-network-performance-issue-by-netflix-technology-blog\/"},"modified":"2024-08-13T17:38:10","modified_gmt":"2024-08-13T17:38:10","slug":"investigation-of-a-cross-regional-network-performance-issue-by-netflix-technology-blog","status":"publish","type":"post","link":"https:\/\/showbizztoday.com\/index.php\/2024\/08\/13\/investigation-of-a-cross-regional-network-performance-issue-by-netflix-technology-blog\/","title":{"rendered":"Investigation of a Cross-regional Network Performance Issue | by Netflix Technology Blog"},"content":{"rendered":"<p> [ad_1]<br \/>\n<\/p>\n<div>\n<div>\n<div>\n<div class=\"speechify-ignore ab co\">\n<div class=\"speechify-ignore bg l\">\n<div class=\"hw hx hy hz ia ab\">\n<div>\n<div class=\"ab ib\"><a href=\"https:\/\/netflixtechblog.medium.com\/?source=post_page-----422d6218fdf1--------------------------------\" rel=\"noopener follow\" target=\"_blank\"><\/p>\n<div>\n<div class=\"bl\" aria-hidden=\"false\">\n<div class=\"l ic id bx ie if\">\n<div class=\"l fk\"><img decoding=\"async\" alt=\"Netflix Technology Blog\" class=\"l fc bx dc dd cw\" src=\"https:\/\/miro.medium.com\/v2\/resize:fill:88:88\/1*BJWRqfSMf9Da9vsXG9EBRQ.jpeg\" width=\"44\" height=\"44\" loading=\"lazy\" data-testid=\"authorPhoto\"\/><\/div>\n<\/div>\n<\/div>\n<\/div>\n<p><\/a><a href=\"https:\/\/netflixtechblog.com\/?source=post_page-----422d6218fdf1--------------------------------\" rel=\"noopener  ugc nofollow\" target=\"_blank\"><\/p>\n<div class=\"ii ab fk\">\n<div>\n<div class=\"bl\" aria-hidden=\"false\">\n<div class=\"l ij ik bx ie il\">\n<div class=\"l fk\"><img decoding=\"async\" alt=\"Netflix TechBlog\" class=\"l fc bx bq im cw\" src=\"https:\/\/miro.medium.com\/v2\/resize:fill:48:48\/1*ty4NvNrGg4ReETxqU2N3Og.png\" width=\"24\" height=\"24\" loading=\"lazy\" 
data-testid=\"publicationPhoto\"\/><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p><\/a><\/div>\n<\/div>\n<div class=\"bm bg l\">\n<div class=\"l iy\"><span class=\"be b bf z dt\"><\/p>\n<div class=\"ab cm iz ja jb\"><span class=\"be b bf z dt\"><\/p>\n<div class=\"ab ae\"><span data-testid=\"storyReadTime\">10 min learn<\/span><\/p>\n<p><span class=\"l\" aria-hidden=\"true\"><span class=\"be b bf z dt\">\u00b7<\/span><\/span><\/p>\n<p><span data-testid=\"storyPublishDate\">Apr 24, 2024<\/span><\/div>\n<p><\/span><\/div>\n<p><\/span><\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p id=\"680a\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\"><a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/hechaoli\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Hechao Li<\/a>, <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/rogercruz\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Roger Cruz<\/a><\/p>\n<p id=\"8b0b\" class=\"pw-post-body-paragraph my mz gv na b nb ov nd ne nf ow nh ni nj ox nl nm nn oy np nq nr oz nt nu nv go bj\">Netflix operates a extremely environment friendly cloud computing infrastructure that helps a wide selection of functions important for our SVOD (Subscription Video on Demand), stay streaming and gaming companies. Utilizing Amazon AWS, our infrastructure is hosted throughout a number of geographic areas worldwide. This world distribution permits our functions to ship content material extra successfully by serving site visitors nearer to our clients. 
Like any distributed system, our applications occasionally require data synchronization between regions to maintain seamless service delivery.<\/p>\n<p id=\"1fe6\">The following diagram shows a simplified cloud network topology for cross-region traffic.<\/p>\n<figure class=\"paragraph-image\"><img alt=\"Simplified cloud network topology for cross-region traffic\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*RpHklRseVBeBJG6u\" width=\"700\" height=\"447\" loading=\"lazy\"\/><\/figure>\n<p id=\"f312\">Our Cloud Network Engineering on-call team received a request to address a network issue affecting an application with cross-region traffic. Initially, it seemed that the application was experiencing timeouts, likely due to suboptimal network performance. As we all know, the longer the network path, the more devices the packets traverse, increasing the likelihood of issues. For this incident, <strong>the client application is located in an internal subnet in the US region while the server application is located in an external subnet in a European region<\/strong>. 
Hence, it\u2019s natural to blame the network since packets have to travel long distances through the internet.<\/p>\n<p id=\"d4f7\">As network engineers, our initial reaction when the network is blamed is usually, \u201cNo, it can\u2019t be the network,\u201d and our job is to prove it. Given that there were no recent changes to the network infrastructure and no reported AWS issues impacting other applications, the on-call engineer suspected a noisy neighbor issue and sought assistance from the Host Network Engineering team.<\/p>\n<p id=\"de5a\">In this context, a noisy neighbor issue occurs when a container shares a host with other network-intensive containers. <strong>These noisy neighbors consume excessive network resources, causing other containers on the same host to suffer from degraded network performance.<\/strong> Despite each container having bandwidth limitations, oversubscription can still lead to such issues.<\/p>\n<p id=\"f159\">Upon investigating other containers on the same host \u2014 most of which were part of the same application \u2014 we quickly eliminated the possibility of noisy neighbors. <strong>The network throughput for both the problematic container and all others was significantly below the set bandwidth limits.<\/strong> We attempted to resolve the issue by removing these bandwidth limits, allowing the application to utilize as much bandwidth as necessary. However, the problem persisted.<\/p>\n<p id=\"cdc9\">We observed some <strong>TCP packets in the network marked with the RST flag<\/strong>, a flag indicating that a connection should be immediately terminated. Although the frequency of these packets was not alarmingly high, the presence of any RST packets still raised suspicion on the network. To determine whether this was indeed a network-induced issue, we performed a tcpdump on the client. In the packet capture file, we found one TCP stream that was closed after exactly 30 seconds.<\/p>\n<p id=\"e4ad\">SYN at 18:47:06<\/p>\n<figure class=\"paragraph-image\"><img alt=\"Packet capture: SYN at 18:47:06\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*ZLnTrJNuCBe4tUry\" width=\"700\" height=\"59\" loading=\"lazy\"\/><\/figure>\n<p id=\"6af6\">After the 3-way handshake (SYN, SYN-ACK, ACK), the traffic started flowing normally. 
Nothing was unusual until the FIN at 18:47:36 (30 seconds later).<\/p>\n<figure class=\"paragraph-image\"><img alt=\"Packet capture: FIN at 18:47:36, followed by RST packets\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*0-aCcRviD0JHcngn\" width=\"700\" height=\"156\" loading=\"lazy\"\/><\/figure>\n<p id=\"f0da\">The packet capture results clearly indicated that <strong>it was the client application that initiated the connection termination by sending a FIN packet<\/strong>. Following this, the server continued to send data; however, since the client had already decided to close the connection, it responded with RST packets to all subsequent data from the server.<\/p>\n<p id=\"c1f7\">To make sure that the client wasn\u2019t closing the connection due to packet loss, we also performed a packet capture on the server side to verify that all packets sent by the server were received. This task was complicated by the fact that the packets passed through a NAT gateway (NGW), which meant that on the server side, the client\u2019s IP and port appeared as those of the NGW, differing from those seen on the client side. 
Consequently, to accurately match TCP streams, <strong>we needed to identify the TCP stream on the client side, locate the raw TCP sequence number, and then use this number as a filter on the server side to find the corresponding TCP stream.<\/strong><\/p>\n<p id=\"92bd\">With packet capture results from both the client and server sides, we confirmed that <strong>all packets sent by the server were correctly received before the client sent a FIN<\/strong>.<\/p>\n<p id=\"1f6d\">Now, from the network point of view, the story is clear. The client initiated the connection requesting data from the server. The server kept sending data to the client with no problem. However, at a certain point, <strong>despite the server still having data to send, the client chose to terminate the reception of data<\/strong>. This led us to suspect that the problem might be related to the client application itself.<\/p>\n<p id=\"e481\">In order to fully understand the problem, we now need to understand how the application works. As shown in the diagram below, the application runs in the us-east-1 region. <strong>It reads data from cross-region servers and writes the data to clients within the same region.<\/strong> The clients run as containers, whereas the servers are EC2 instances.<\/p>\n<p id=\"e82c\"><strong>Notably, the cross-region read was problematic<\/strong> while the write path was smooth. Most importantly, there is a 30-second application-level timeout for reading the data. The application (client) errors out if it fails to read an initial batch of data from the servers within 30 seconds. When we increased this timeout to 60 seconds, everything worked as expected. <strong>This explains why the client initiated a FIN \u2014 because it lost patience waiting for the server to transfer data<\/strong>.<\/p>\n<figure class=\"paragraph-image\"><img alt=\"The application in us-east-1 reads data from cross-region servers and writes it to clients in the same region\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*zNmSGl1_5vtOHETn\" width=\"700\" height=\"300\" loading=\"lazy\"\/><\/figure>\n<p id=\"da9f\">Could it be that the server was updated to send data more slowly? Could it be that the client application was updated to receive data more slowly? Could it be that the data volume became too large to be completely sent out within 30 seconds? 
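As a minimal illustration of the client behavior described above (a sketch in Python, not the actual application code; the function name and sizes are made up), a read loop with an application-level timeout looks roughly like this; when the timeout fires, closing the socket is what emits the FIN, and any data the server keeps sending afterwards is answered with RSTs:

```python
import socket

def read_initial_batch(sock: socket.socket, nbytes: int, timeout_s: float = 30.0) -> bytes:
    # Hypothetical sketch of the client's read path: give the server at most
    # timeout_s seconds to deliver the initial batch; on timeout, close the
    # socket (sending a FIN) and propagate the error.
    sock.settimeout(timeout_s)
    chunks, got = [], 0
    try:
        while got < nbytes:
            chunk = sock.recv(min(65536, nbytes - got))
            if not chunk:  # server closed early
                break
            chunks.append(chunk)
            got += len(chunk)
    except socket.timeout:
        sock.close()  # application-level timeout -> FIN on the wire
        raise
    return b"".join(chunks)
```

Whether the initial batch fits inside those 30 seconds depends on the effective transfer rate, which is what the rest of the investigation turns on.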
Sadly, <strong>we received negative answers to all 3 questions from the application owner.<\/strong> The server had been running without changes for over a year, there were no significant updates in the latest rollout of the client, and the data volume had remained consistent.<\/p>\n<p id=\"efa1\">If neither the network nor the application had changed recently, then what had? In fact, we discovered that the issue coincided with a recent <strong>Linux kernel upgrade from version 6.5.13 to 6.6.10<\/strong>. To test this hypothesis, we rolled back the kernel upgrade, and doing so restored normal operation to the application.<\/p>\n<p id=\"9ee2\">Honestly speaking, at that time I didn\u2019t believe it was a kernel bug, because I assumed the TCP implementation in the kernel should be solid and stable (Spoiler alert: how wrong I was!). But we were also out of ideas from other angles.<\/p>\n<p id=\"f536\">There were about 14k commits between the good and bad kernel versions. Engineers on the team methodically and diligently bisected between the two versions. When the bisecting was narrowed down to a few commits, <strong>a change with \u201ctcp\u201d in its commit message caught our attention. The final bisecting confirmed that <\/strong><a href=\"https:\/\/lore.kernel.org\/netdev\/20230717152917.751987-1-edumazet@google.com\/T\/\" rel=\"noopener ugc nofollow\" target=\"_blank\"><strong>this commit<\/strong><\/a><strong> was our culprit<\/strong>.<\/p>\n<p id=\"cc01\">Interestingly, while reviewing the email history related to this commit, we found that <a href=\"https:\/\/github.com\/eventlet\/eventlet\/issues\/821\" rel=\"noopener ugc nofollow\" target=\"_blank\">another user had reported a Python test failure following the same kernel upgrade<\/a>. Although their solution was not directly applicable to our situation, it suggested that <strong>a simpler test might also reproduce our problem<\/strong>. Using <em>strace<\/em>, we observed that the application configured the following socket options when communicating with the server:<\/p>\n<pre><span id=\"95ec\">[pid 1699] setsockopt(917, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0<br\/>[pid 1699] setsockopt(917, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0<br\/>[pid 1699] setsockopt(917, SOL_SOCKET, SO_SNDBUF, [131072], 4) = 0<br\/>[pid 1699] setsockopt(917, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0<br\/>[pid 1699] setsockopt(917, SOL_TCP, TCP_NODELAY, [1], 4) = 0<\/span><\/pre>\n<p id=\"3e18\">We then developed a minimal client-server C application that transfers a file from the server to the client, with the client configuring the same set of socket options. 
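For illustration, the same set of options can also be reproduced from Python (a sketch assuming a Linux host; our actual reproduction was the C application described here). The option values are the ones from the strace output:

```python
import socket

# Recreate the socket options seen in the strace output
# (SO_SNDBUF = 131072, SO_RCVBUF = 65536, as in the article).
s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 131072)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# On Linux, the kernel doubles the requested SO_RCVBUF to leave room for
# bookkeeping overhead, so getsockopt reports 2 * 65536 = 131072.
effective_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(effective_rcvbuf)
s.close()
```

The doubled value reported by getsockopt is exactly the `sk_rcvbuf` behavior discussed in the receive-window section later in the post.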
During testing, we used a 10MB file, which represents the volume of data typically transferred within 30 seconds before the client issues a FIN. <strong>On the old kernel, this cross-region transfer completed in 22 seconds, whereas on the new kernel, it took 39 seconds to finish.<\/strong><\/p>\n<p id=\"1490\">With the help of this minimal reproduction setup, we were ultimately able to pinpoint the root cause of the problem. In order to understand it, it\u2019s essential to have a grasp of the TCP receive window.<\/p>\n<h2 id=\"6a6e\">TCP Receive Window<\/h2>\n<p id=\"a997\">Simply put, <strong>the TCP receive window is how the receiver tells the sender, \u201cThis is how many bytes you can send me without me ACKing any of them.\u201d<\/strong> Assuming the sender is the server and the receiver is the client, then we have:<\/p>\n<figure class=\"paragraph-image\"><img alt=\"The receiver (client) advertises the TCP receive window to the sender (server)\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*98gJP81W46nhdonq\" width=\"700\" height=\"456\" loading=\"lazy\"\/><\/figure>\n<h2 id=\"f567\">The Window Size<\/h2>\n<p id=\"0d83\">Now that we know the TCP receive window size can affect throughput, the question is: how is the window size calculated? As an application writer, you can\u2019t set the window size directly; however, you can decide how much memory you want to use for buffering received data. This is configured using the <strong><em>SO_RCVBUF<\/em> socket option<\/strong> we saw in the <em>strace<\/em> result above. Note, however, that the value of this option determines how much <strong>application data<\/strong> can be queued in the receive buffer. 
In <a href=\"https:\/\/man7.org\/linux\/man-pages\/man7\/socket.7.html\" rel=\"noopener ugc nofollow\" target=\"_blank\">man 7 socket<\/a>, there is:<\/p>\n<blockquote>\n<p id=\"a73f\">SO_RCVBUF<\/p>\n<p id=\"ed59\">Sets or gets the maximum socket receive buffer in bytes. The kernel doubles this value (to allow space for bookkeeping overhead) when it is set using setsockopt(2), and this doubled value is returned by getsockopt(2). The default value is set by the \/proc\/sys\/net\/core\/rmem_default file, and the maximum allowed value is set by the \/proc\/sys\/net\/core\/rmem_max file. The minimum (doubled) value for this option is 256.<\/p>\n<\/blockquote>\n<p id=\"71af\">This means that when the user supplies a value X, <a href=\"https:\/\/elixir.bootlin.com\/linux\/v6.9-rc1\/source\/net\/core\/sock.c#L976\" rel=\"noopener ugc nofollow\" target=\"_blank\">the kernel stores 2X in the variable sk-&gt;sk_rcvbuf<\/a>. In other words, <strong>the kernel assumes that the bookkeeping overhead is as large as the actual data (i.e. 50% of the sk_rcvbuf)<\/strong>.<\/p>\n<h2 id=\"7337\">sysctl_tcp_adv_win_scale<\/h2>\n<p id=\"9a3d\">However, the assumption above may not hold, because the actual overhead depends on many factors such as the Maximum Transmission Unit (MTU). Therefore, <strong>the kernel provides <em>sysctl_tcp_adv_win_scale<\/em>, which you can use to tell the kernel what the actual overhead is<\/strong>. (I believe 99% of people don\u2019t know how to set this parameter correctly, and I\u2019m definitely one of them. You\u2019re the kernel; if you don\u2019t know the overhead, how can you expect me to know?)<\/p>\n<p id=\"4e58\">According to <a href=\"https:\/\/docs.kernel.org\/networking\/ip-sysctl.html\" rel=\"noopener ugc nofollow\" target=\"_blank\">the <em>sysctl<\/em> doc<\/a>:<\/p>\n<blockquote>\n<p id=\"23e2\"><em>tcp_adv_win_scale \u2014 INTEGER<\/em><\/p>\n<p id=\"a18b\"><em>Obsolete since linux-6.6. Count buffering overhead as bytes\/2^tcp_adv_win_scale (if tcp_adv_win_scale &gt; 0) or bytes-bytes\/2^(-tcp_adv_win_scale), if it is &lt;= 0.<\/em><\/p>\n<p id=\"0786\"><em>Possible values are [-31, 31], inclusive.<\/em><\/p>\n<p id=\"2f88\"><em>Default: 1<\/em><\/p>\n<\/blockquote>\n<p id=\"ebc0\">For the 99% of us using the default value of 1, the overhead is calculated as <em>rcvbuf\/2^tcp_adv_win_scale = 1\/2 * rcvbuf<\/em>. 
This matches the assumption when setting the <em class=\"pq\">SO_RCVBUF<\/em> value.<\/p>\n<p id=\"289d\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">Let\u2019s recap. Assume you set <em class=\"pq\">SO_RCVBUF<\/em> to 65536, which is the value set by the application as shown in the <em class=\"pq\">setsockopt<\/em> syscall. Then we have:<\/p>\n<ul class=\"\">\n<li id=\"e79f\" class=\"my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv qt qu qv bj\">SO_RCVBUF = 65536<\/li>\n<li id=\"72c0\" class=\"my mz gv na b nb qw nd ne nf qx nh ni nj qy nl nm nn qz np nq nr ra nt nu nv qt qu qv bj\">rcvbuf = 2 * 65536 = 131072<\/li>\n<li id=\"e031\" class=\"my mz gv na b nb qw nd ne nf qx nh ni nj qy nl nm nn qz np nq nr ra nt nu nv qt qu qv bj\">overhead = rcvbuf \/ 2 = 131072 \/ 2 = 65536<\/li>\n<li id=\"7e11\" class=\"my mz gv na b nb qw nd ne nf qx nh ni nj qy nl nm nn qz np nq nr ra nt nu nv qt qu qv bj\">receive window size = rcvbuf - overhead = 131072 - 65536 = 65536<\/li>\n<\/ul>\n<p id=\"6c6d\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">(Note, this calculation is simplified. The real calculation is more complicated.)<\/p>\n<p id=\"4e0b\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">In short, the receive window size before the kernel upgrade was 65536. 
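The recap above can be sketched in a few lines of Python (simplified in the same way as the bullet list it mirrors):

```python
def old_receive_window(so_rcvbuf: int, tcp_adv_win_scale: int = 1) -> int:
    # setsockopt(SO_RCVBUF, X) -> the kernel stores 2X in sk->sk_rcvbuf
    rcvbuf = 2 * so_rcvbuf
    # default tcp_adv_win_scale = 1 -> half of rcvbuf counted as overhead
    overhead = rcvbuf >> tcp_adv_win_scale
    return rcvbuf - overhead

print(old_receive_window(65536))  # 65536
```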
With this window size, the application was able to transfer 10M data within 30 seconds.<\/p>\n<h2 id=\"0ac0\" class=\"qa ny gv be nz qb qc dx od qd qe dz oh nj qf qg qh nn qi qj qk nr ql qm qn qo bj\">The Change<\/h2>\n<p id=\"8abe\" class=\"pw-post-body-paragraph my mz gv na b nb ov nd ne nf ow nh ni nj ox nl nm nn oy np nq nr oz nt nu nv go bj\"><a class=\"af nw\" href=\"https:\/\/lore.kernel.org\/netdev\/20230717152917.751987-1-edumazet@google.com\/T\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">This commit<\/a> obsoleted <em class=\"pq\">sysctl_tcp_adv_win_scale<\/em> and introduced a <em class=\"pq\">scaling_ratio<\/em> that can more accurately calculate the overhead or window size, which is the right thing to do. With the change, the window size is now <em class=\"pq\">rcvbuf * scaling_ratio<\/em>.<\/p>\n<p id=\"9bdf\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">So how is <em class=\"pq\">scaling_ratio<\/em> calculated? It is calculated using <strong class=\"na gw\"><em class=\"pq\">skb-&gt;len\/skb-&gt;truesize<\/em><\/strong> where <em class=\"pq\">skb-&gt;len<\/em> is the length of the tcp data in an <em class=\"pq\">skb<\/em> and <em class=\"pq\">truesize<\/em> is the total size of the <em class=\"pq\">skb<\/em>. <strong class=\"na gw\">This is definitely a more accurate ratio based on real data rather than a hardcoded 50%.<\/strong> Now, here is the next question: during the TCP handshake <strong class=\"na gw\">before any data is transferred, how do we decide the initial <em class=\"pq\">scaling_ratio<\/em>? 
<\/strong>The answer is, a magic and conservative ratio was chosen, with the value being roughly 0.25.<\/p>\n<p id=\"f7cb\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">Now we have:<\/p>\n<ul class=\"\">\n<li id=\"5f22\" class=\"my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv qt qu qv bj\">SO_RCVBUF = 65536<\/li>\n<li id=\"3e62\" class=\"my mz gv na b nb qw nd ne nf qx nh ni nj qy nl nm nn qz np nq nr ra nt nu nv qt qu qv bj\">rcvbuf = 2 * 65536 = 131072<\/li>\n<li id=\"88c1\" class=\"my mz gv na b nb qw nd ne nf qx nh ni nj qy nl nm nn qz np nq nr ra nt nu nv qt qu qv bj\">receive window size = rcvbuf * 0.25 = 131072 * 0.25 = 32768<\/li>\n<\/ul>\n<p id=\"79ea\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">In short, <strong class=\"na gw\">the receive window size halved after the kernel upgrade. Hence the throughput was cut in half<\/strong>,<strong class=\"na gw\"> causing the data transfer time to double.<\/strong><\/p>\n<p id=\"de39\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">Naturally, you may ask: I understand that the initial window size is small, but <strong class=\"na gw\">why doesn\u2019t the window grow when we have a more accurate ratio of the payload later<\/strong> (i.e. <em class=\"pq\">skb-&gt;len\/skb-&gt;truesize<\/em>)? With some debugging, we eventually found that the <em class=\"pq\">scaling_ratio<\/em> does <a class=\"af nw\" href=\"https:\/\/elixir.bootlin.com\/linux\/v6.7.9\/source\/net\/ipv4\/tcp_input.c#L248\" rel=\"noopener ugc nofollow\" target=\"_blank\">get updated to a more accurate <em class=\"pq\">skb-&gt;len\/skb-&gt;truesize<\/em><\/a>, which in our case is around 0.66. 
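The post-upgrade arithmetic from the bullets above can be sketched the same way (again simplified; the real kernel tracks these values dynamically):

```python
INITIAL_SCALING_RATIO = 0.25  # the conservative initial guess

def new_initial_window(so_rcvbuf: int) -> int:
    rcvbuf = 2 * so_rcvbuf  # kernel doubling, unchanged
    return int(rcvbuf * INITIAL_SCALING_RATIO)

# The same 64 KiB SO_RCVBUF now yields half the previous window:
print(new_initial_window(65536))  # 32768 (was 65536 before the upgrade)
```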
However, another variable, <em class=\"pq\">window_clamp<\/em>, isn&#8217;t updated accordingly. <em class=\"pq\">window_clamp<\/em> is the <a class=\"af nw\" href=\"https:\/\/elixir.bootlin.com\/linux\/v6.7.9\/source\/include\/linux\/tcp.h#L256\" rel=\"noopener ugc nofollow\" target=\"_blank\">maximum receive window allowed to be advertised<\/a>, which is also initialized to <em class=\"pq\">0.25 * rcvbuf <\/em>using the initial <em class=\"pq\">scaling_ratio<\/em>. As a result, <strong class=\"na gw\">the receive window size is capped at this value and can\u2019t grow bigger<\/strong>.<\/p>\n<p id=\"e049\" class=\"pw-post-body-paragraph my mz gv na b nb ov nd ne nf ow nh ni nj ox nl nm nn oy np nq nr oz nt nu nv go bj\">In theory, the fix is to update <em class=\"pq\">window_clamp<\/em> along with <em class=\"pq\">scaling_ratio<\/em>. However, in order to have a simple fix that doesn\u2019t introduce other unexpected behaviors, <a class=\"af nw\" href=\"https:\/\/git.kernel.org\/pub\/scm\/linux\/kernel\/git\/netdev\/net-next.git\/commit\/?id=697a6c8cec03\" rel=\"noopener ugc nofollow\" target=\"_blank\">our final fix was to increase the initial <em class=\"pq\">scaling_ratio<\/em> from 25% to 50%<\/a>. This makes the receive window size backward compatible with the original default <em class=\"pq\">sysctl_tcp_adv_win_scale<\/em>.<\/p>\n<p id=\"fb5c\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">Meanwhile, notice that the problem isn&#8217;t only caused by the changed kernel behavior but also by the fact that the application sets <em class=\"pq\">SO_RCVBUF<\/em> and has a 30-second application-level timeout. 
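To see how the halved window interacts with that 30-second timeout, recall that a window-limited connection moves at most one window of data per round trip, so throughput is roughly window/RTT. The RTT below is an assumed cross-regional value chosen for illustration, not a measured one:

```python
def transfer_seconds(data_bytes: int, window_bytes: int, rtt_s: float) -> float:
    # A window-limited TCP connection moves at most one window per RTT.
    throughput = window_bytes / rtt_s
    return data_bytes / throughput

RTT = 0.100              # assumed cross-regional round-trip time (illustrative)
DATA = 10 * 1024 * 1024  # the ~10M transfer from the incident

before = transfer_seconds(DATA, 65536, RTT)  # ~16 s, inside a 30 s timeout
after = transfer_seconds(DATA, 32768, RTT)   # ~32 s, now past the timeout
print(f"before: {before:.0f}s, after: {after:.0f}s")
```

Under these assumptions the transfer fits comfortably inside the timeout with the old 64 KiB window but exceeds it once the window is halved, which is exactly the doubling of transfer time described above.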
In fact, the application is Kafka Connect and both settings are the default configurations (<a class=\"af nw\" href=\"https:\/\/kafka.apache.org\/documentation\/#connectconfigs_receive.buffer.bytes\" rel=\"noopener ugc nofollow\" target=\"_blank\"><em class=\"pq\">receive.buffer.bytes=64k<\/em><\/a> and <a class=\"af nw\" href=\"https:\/\/kafka.apache.org\/documentation\/#consumerconfigs_request.timeout.ms\" rel=\"noopener ugc nofollow\" target=\"_blank\"><em class=\"pq\">request.timeout.ms=30s<\/em><\/a>). We also<a class=\"af nw\" href=\"https:\/\/issues.apache.org\/jira\/browse\/KAFKA-16496\" rel=\"noopener ugc nofollow\" target=\"_blank\"> created a Kafka ticket to change receive.buffer.bytes to -1<\/a> to allow Linux to auto-tune the receive window.<\/p>\n<p id=\"0152\" class=\"pw-post-body-paragraph my mz gv na b nb ov nd ne nf ow nh ni nj ox nl nm nn oy np nq nr oz nt nu nv go bj\">This was a very interesting debugging exercise that covered many layers of Netflix\u2019s stack and infrastructure. While it technically wasn\u2019t the \u201cnetwork\u201d to blame, this time it turned out the culprit was the software components that make up the network (i.e. the TCP implementation in the kernel).<\/p>\n<p id=\"d2c0\" class=\"pw-post-body-paragraph my mz gv na b nb nc nd ne nf ng nh ni nj nk nl nm nn no np nq nr ns nt nu nv go bj\">If tackling such technical challenges excites you, consider joining our Cloud Infrastructure Engineering teams. 
Explore opportunities by visiting <a class=\"af nw\" href=\"https:\/\/jobs.netflix.com\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Netflix Jobs<\/a> and searching for Cloud Engineering positions.<\/p>\n<p id=\"c3e7\" class=\"pw-post-body-paragraph my mz gv na b nb ov nd ne nf ow nh ni nj ox nl nm nn oy np nq nr oz nt nu nv go bj\">Special thanks to our stunning colleagues <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/alok-tiagi-99205015\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Alok Tiagi<\/a>, <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/artemtkachuk\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Artem Tkachuk<\/a>, <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/jethanadams\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Ethan Adams<\/a>, <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/jorge-rodriguez-12b5595\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Jorge Rodriguez<\/a>, <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/nickmahilani\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Nick Mahilani<\/a>, <a class=\"af nw\" href=\"https:\/\/tycho.pizza\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Tycho Andersen<\/a> and <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/vinay-rayini\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Vinay Rayini<\/a> for investigating and mitigating this issue. 
We would also like to thank Linux kernel network expert <a class=\"af nw\" href=\"https:\/\/www.linkedin.com\/in\/eric-dumazet-ba252942\/\" rel=\"noopener ugc nofollow\" target=\"_blank\">Eric Dumazet<\/a> for reviewing and applying the patch.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>10 min read \u00b7 Apr 24, 2024 Hechao Li, Roger Cruz Netflix operates a highly efficient cloud computing infrastructure that supports a wide range of applications essential for our SVOD (Subscription Video on Demand), live streaming and gaming services. Using Amazon AWS, our infrastructure is hosted across multiple geographic regions worldwide. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":133035,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37],"tags":[955,5951,2502,5952,115,2434,449,4337],"class_list":{"0":"post-133033","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-netflix","8":"tag-blog","9":"tag-crossregional","10":"tag-investigation","11":"tag-issue","12":"tag-netflix","13":"tag-network","14":"tag-performance","15":"tag-technology"},"_links":{"self":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts\/133033","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/comments?post=133033"}],"version-history":[{"count":0,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts\/133033\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":
"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/media\/133035"}],"wp:attachment":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/media?parent=133033"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/categories?post=133033"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/tags?post=133033"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}