{"id":78946,"date":"2023-03-07T23:29:23","date_gmt":"2023-03-07T23:29:23","guid":{"rendered":"https:\/\/showbizztoday.com\/index.php\/2023\/03\/07\/data-ingestion-pipeline-with-operation-management-marken\/"},"modified":"2023-03-07T23:29:23","modified_gmt":"2023-03-07T23:29:23","slug":"data-ingestion-pipeline-with-operation-management-marken","status":"publish","type":"post","link":"https:\/\/showbizztoday.com\/index.php\/2023\/03\/07\/data-ingestion-pipeline-with-operation-management-marken\/","title":{"rendered":"Data ingestion pipeline with Operation Management (Marken)"},"content":{"rendered":"<div>\n<p id=\"bfc8\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">At Netflix, many Media Algorithm teams work hand in hand with content creators and editors to promote and recommend content to users in the best possible way. Several of these algorithms aim to improve different manual workflows so that we show the personalized promotional image, trailer or show to the user.<\/p>\n<p id=\"7e2a\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">These media-focused machine learning algorithms, as well as other teams, generate a lot of data from the media files. As we described in our <a class=\"ae ke\" rel=\"noopener ugc nofollow\" target=\"_blank\" href=\"https:\/\/netflixtechblog.com\/scalable-annotation-service-marken-f5ba9266d428\">previous blog<\/a>, this data is stored as annotations in Marken. 
We designed a unique concept called Annotation Operations which allows teams to create data pipelines and easily write annotations without worrying about the access patterns of their data from different applications.<\/p>\n<figure class=\"lz ma mb mc gs md gg gh paragraph-image\">\n<div role=\"button\" tabindex=\"0\" class=\"me mf di mg bf mh\">\n<div class=\"gg gh ly\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*c7aT8gHQfxOWmcdt 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*c7aT8gHQfxOWmcdt 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*c7aT8gHQfxOWmcdt 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*c7aT8gHQfxOWmcdt 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*c7aT8gHQfxOWmcdt 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*c7aT8gHQfxOWmcdt 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*c7aT8gHQfxOWmcdt 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" type=\"image\/webp\"\/><source data-testid=\"og\" srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*c7aT8gHQfxOWmcdt 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*c7aT8gHQfxOWmcdt 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*c7aT8gHQfxOWmcdt 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*c7aT8gHQfxOWmcdt 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*c7aT8gHQfxOWmcdt 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*c7aT8gHQfxOWmcdt 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*c7aT8gHQfxOWmcdt 1400w\" 
sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"\/><img alt=\"\" class=\"bf mi mj c\" width=\"700\" height=\"412\" loading=\"lazy\" role=\"presentation\"\/><\/picture><\/div>\n<\/div><figcaption class=\"mk ml gi gg gh mm mn bd b be z dk\">Annotation Operations<\/figcaption><\/figure>\n<p id=\"ce2b\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Let\u2019s pick an example use case of identifying objects (like trees, cars etc.) in a video file. As described in the picture above:<\/p>\n<ul class=\"\">\n<li id=\"b4a6\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">During the first run of the algorithm, it identified 500 objects in a particular video file. These 500 objects were stored as annotations of a specific schema type, let\u2019s say Objects, in Marken.<\/li>\n<li id=\"ba2a\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">The algorithm team improved their algorithm. When we re-ran the algorithm on the same video file, it created 600 annotations of schema type Objects and stored them in our service.<\/li>\n<\/ul>\n<p id=\"a4a5\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Notice that we cannot update the annotations from previous runs because we don\u2019t know how many annotations a new algorithm run will produce. 
It would also be very expensive for us to keep track of which annotations need to be updated.<\/p>\n<p id=\"7944\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">The goal is that when a consumer searches for annotations of type Objects for the given video file, the following should happen:<\/p>\n<ul class=\"\">\n<li id=\"56f2\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">Before Algo run 1, if they search they should not find anything.<\/li>\n<li id=\"d02a\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">After the completion of Algo run 1, the query should find the first set of 500 annotations.<\/li>\n<li id=\"7e98\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">While Algo run 2 is creating the set of 600 annotations, client searches should still return the older 500 annotations.<\/li>\n<li id=\"a025\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">When all of the 600 annotations are successfully created, they should replace the older set of 500.<\/li>\n<li id=\"0774\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">So now when clients search annotations for Objects they should get 600 annotations.<\/li>\n<\/ul>\n<p id=\"115b\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Does this remind you of something? This seems very similar (though not exactly the same) to a distributed transaction.<\/p>\n<p id=\"5f5f\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Typically, an algorithm run can have 2k-5k annotations. 
There are many naive solutions possible for this problem, for example:<\/p>\n<ul class=\"\">\n<li id=\"ff66\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">Write different runs in different databases. This is obviously very expensive.<\/li>\n<li id=\"d4dc\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">Write algo runs into files. But we cannot search or provide low latency retrievals from files.<\/li>\n<li id=\"2528\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">Etc.<\/li>\n<\/ul>\n<p id=\"c131\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Instead, our challenge was to implement this feature on top of Cassandra and ElasticSearch databases because that\u2019s what Marken uses. The solution which we present in this blog is not limited to annotations and can be used for any other domain which uses ES and Cassandra as well.<\/p>\n<p id=\"2bee\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">Marken\u2019s architecture diagram is as follows. We refer the reader to our previous <a class=\"ae ke\" rel=\"noopener ugc nofollow\" target=\"_blank\" href=\"https:\/\/netflixtechblog.com\/scalable-annotation-service-marken-f5ba9266d428\">blog article<\/a> for details. 
We use Cassandra as the source of truth where we store the annotations, while we index annotations in ElasticSearch to provide rich search functionality.<\/p>\n<figure class=\"lz ma mb mc gs md gg gh paragraph-image\">\n<div role=\"button\" tabindex=\"0\" class=\"me mf di mg bf mh\">\n<div class=\"gg gh nc\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*YNagRtOKICtNNMg2 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*YNagRtOKICtNNMg2 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*YNagRtOKICtNNMg2 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*YNagRtOKICtNNMg2 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*YNagRtOKICtNNMg2 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*YNagRtOKICtNNMg2 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*YNagRtOKICtNNMg2 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" type=\"image\/webp\"\/><source data-testid=\"og\" srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*YNagRtOKICtNNMg2 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*YNagRtOKICtNNMg2 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*YNagRtOKICtNNMg2 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*YNagRtOKICtNNMg2 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*YNagRtOKICtNNMg2 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*YNagRtOKICtNNMg2 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*YNagRtOKICtNNMg2 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, 
(-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"\/><img alt=\"\" class=\"bf mi mj c\" width=\"700\" height=\"321\" loading=\"lazy\" role=\"presentation\"\/><\/picture><\/div>\n<\/div><figcaption class=\"mk ml gi gg gh mm mn bd b be z dk\">Marken Architecture<\/figcaption><\/figure>\n<p id=\"7884\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Our goal was to help teams at Netflix create data pipelines without thinking about how that data is made available to the readers or the client teams. Similarly, client teams don\u2019t have to worry about when or how the data is written. This is what we call decoupling producer flows from clients of the data.<\/p>\n<p id=\"3c8c\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">The lifecycle of a movie goes through a lot of creative stages. We have many temporary files which are delivered before we get to the final file of the movie. Similarly, a movie has many different languages and each of those languages can have different files delivered. 
Teams generally want to run algorithms and create annotations using all those media files.<\/p>\n<p id=\"a280\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Since algorithms can be run on different permutations of how the media files are created and delivered, we can simplify an algorithm run as follows:<\/p>\n<ul class=\"\">\n<li id=\"a168\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">Annotation Schema Type \u2014 identifies the schema for the annotation generated by the Algorithm.<\/li>\n<li id=\"b560\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">Annotation Schema Version \u2014 identifies the schema version of the annotation generated by the Algorithm.<\/li>\n<li id=\"1909\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">PivotId \u2014 a unique string identifier which identifies the file or method which is used to generate the annotations. 
This could be the SHA hash of the file or simply the movie identifier number.<\/li>\n<\/ul>\n<p id=\"abb8\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Given the above, we can describe the data model for an annotation operation as follows.<\/p>\n<pre class=\"lz ma mb mc gs nd ne nf bn ng nh bi\"><span id=\"9114\" class=\"ni kg ip ne b be nj nk l nl nm\">{<br\/>\"annotationOperationKeys\": [<br\/>{<br\/>\"annotationType\": \"string\",   \u2776<br\/>\"annotationTypeVersion\": \"integer\",<br\/>\"pivotId\": \"string\",<br\/>\"operationNumber\": \"integer\"    \u2777<br\/>}<br\/>],<br\/>\"id\": \"UUID\",<br\/>\"operationStatus\": \"STARTED\",   \u2778<br\/>\"isActive\": true   \u2779<br\/>}<\/span><\/pre>\n<ol class=\"\">\n<li id=\"80f7\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls nn mu mv mw bi\">We already explained AnnotationType, AnnotationTypeVersion and PivotId above.<\/li>\n<li id=\"8348\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls nn mu mv mw bi\">OperationNumber is an auto-incremented number for each new operation.<\/li>\n<li id=\"4fdf\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls nn mu mv mw bi\">OperationStatus \u2014 An operation goes through three phases: Started, Finished and Canceled.<\/li>\n<li id=\"cf9a\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls nn mu mv mw bi\">IsActive \u2014 Whether an operation and its associated annotations are active and searchable.<\/li>\n<\/ol>\n<p id=\"0cd1\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">As you can see from the data model, the producer of an annotation has to choose an AnnotationOperationKey which lets them define how they want to UPSERT annotations in an AnnotationOperation. 
Inside AnnotationOperationKey, the important field is pivotId and how it is generated.<\/p>\n<p id=\"5a25\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">Our source of truth for all objects in Marken is Cassandra. To store Annotation Operations we have the following main tables.<\/p>\n<ul class=\"\">\n<li id=\"63ba\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">AnnotationOperationById \u2014 It stores the AnnotationOperations.<\/li>\n<li id=\"adeb\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">AnnotationIdByAnnotationOperationId \u2014 It stores the IDs of all annotations in an operation.<\/li>\n<\/ul>\n<p id=\"6e75\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Since Cassandra is NoSQL, we have more tables which help us create reverse indices and run admin jobs so that we can scan all annotation operations whenever there is a need.<\/p>\n<p id=\"024e\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">Each annotation in Marken is also indexed in ElasticSearch for powering various searches. To record the relationship between annotation and operation we also index two fields:<\/p>\n<ul class=\"\">\n<li id=\"7c7c\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">annotationOperationId \u2014 The ID of the operation to which this annotation belongs.<\/li>\n<li id=\"e81d\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">isAnnotationOperationActive \u2014 Whether the operation is in an ACTIVE state.<\/li>\n<\/ul>\n<p id=\"af87\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">We provide three APIs to our users. 
In the following sections we describe the APIs and the state management done within the APIs.<\/p>\n<h2 id=\"cf5c\" class=\"no kg ip bd kh np nq dn kl nr ns dp kp lg nt nu kr lk nv nw kt lo nx ny kv nz bi\"><strong class=\"ak\">BeginAnnotationOperation<\/strong><\/h2>\n<p id=\"bcb7\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">When this API is called we store the operation with its OperationKey (tuple of annotationType, annotationTypeVersion and pivotId) in our database. This new operation is marked to be in STARTED state. We store all OperationIDs which are in STARTED state in a distributed cache (EVCache) for fast access during searches.<\/p>\n<figure class=\"lz ma mb mc gs md gg gh paragraph-image\">\n<div role=\"button\" tabindex=\"0\" class=\"me mf di mg bf mh\">\n<div class=\"gg gh oa\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*8fGCkjFSDZosgCXR 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*8fGCkjFSDZosgCXR 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*8fGCkjFSDZosgCXR 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*8fGCkjFSDZosgCXR 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*8fGCkjFSDZosgCXR 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*8fGCkjFSDZosgCXR 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*8fGCkjFSDZosgCXR 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" type=\"image\/webp\"\/><source data-testid=\"og\" 
srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*8fGCkjFSDZosgCXR 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*8fGCkjFSDZosgCXR 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*8fGCkjFSDZosgCXR 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*8fGCkjFSDZosgCXR 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*8fGCkjFSDZosgCXR 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*8fGCkjFSDZosgCXR 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*8fGCkjFSDZosgCXR 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"\/><img alt=\"\" class=\"bf mi mj c\" width=\"700\" height=\"67\" loading=\"lazy\" role=\"presentation\"\/><\/picture><\/div>\n<\/div><figcaption class=\"mk ml gi gg gh mm mn bd b be z dk\">BeginAnnotationOperation<\/figcaption><\/figure>\n<h2 id=\"567b\" class=\"no kg ip bd kh np nq dn kl nr ns dp kp lg nt nu kr lk nv nw kt lo nx ny kv nz bi\"><strong class=\"ak\">UpsertAnnotationsInOperation<\/strong><\/h2>\n<p id=\"e202\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">Users call this API to upsert the annotations in an Operation. They pass annotations along with the OperationID. We store the annotations and also record the relationship between the annotation IDs and the Operation ID in Cassandra. 
During this phase operations are in isAnnotationOperationActive = ACTIVE and operationStatus = STARTED state.<\/p>\n<p id=\"0430\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">Note that typically a single operation run can create 2K to 5K annotations. Clients can call this API from many different machines or threads for fast upserts.<\/p>\n<figure class=\"lz ma mb mc gs md gg gh paragraph-image\">\n<div role=\"button\" tabindex=\"0\" class=\"me mf di mg bf mh\">\n<div class=\"gg gh ob\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*XDc2-pXj6B6Yyfcq 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*XDc2-pXj6B6Yyfcq 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*XDc2-pXj6B6Yyfcq 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*XDc2-pXj6B6Yyfcq 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*XDc2-pXj6B6Yyfcq 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*XDc2-pXj6B6Yyfcq 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*XDc2-pXj6B6Yyfcq 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" type=\"image\/webp\"\/><source data-testid=\"og\" srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*XDc2-pXj6B6Yyfcq 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*XDc2-pXj6B6Yyfcq 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*XDc2-pXj6B6Yyfcq 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*XDc2-pXj6B6Yyfcq 786w, 
https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*XDc2-pXj6B6Yyfcq 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*XDc2-pXj6B6Yyfcq 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*XDc2-pXj6B6Yyfcq 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"\/><img alt=\"\" class=\"bf mi mj c\" width=\"700\" height=\"96\" loading=\"lazy\" role=\"presentation\"\/><\/picture><\/div>\n<\/div><figcaption class=\"mk ml gi gg gh mm mn bd b be z dk\">UpsertAnnotationsInOperation<\/figcaption><\/figure>\n<h2 id=\"07da\" class=\"no kg ip bd kh np nq dn kl nr ns dp kp lg nt nu kr lk nv nw kt lo nx ny kv nz bi\"><strong class=\"ak\">EndAnnotationOperation<\/strong><\/h2>\n<p id=\"be9f\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">Once the annotations have been created in an operation, clients call EndAnnotationOperation which changes the following:<\/p>\n<ul class=\"\">\n<li id=\"d969\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">Marks the current operation (let\u2019s say with ID2) to be operationStatus = FINISHED and isAnnotationOperationActive=ACTIVE.<\/li>\n<li id=\"9c84\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">We remove ID2 from EVCache since it is no longer in STARTED state.<\/li>\n<li id=\"b25c\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">Any previous operation (let\u2019s say with ID1) which was ACTIVE is now marked 
isAnnotationOperationActive=FALSE in Cassandra.<\/li>\n<li id=\"b738\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">Finally, we call the updateByQuery API in ElasticSearch. This API finds all Elasticsearch documents with ID1 and marks isAnnotationOperationActive=FALSE.<\/li>\n<\/ul>\n<figure class=\"lz ma mb mc gs md gg gh paragraph-image\">\n<div role=\"button\" tabindex=\"0\" class=\"me mf di mg bf mh\">\n<div class=\"gg gh oc\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*7ZDcZBaKK264cIAh 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*7ZDcZBaKK264cIAh 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*7ZDcZBaKK264cIAh 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*7ZDcZBaKK264cIAh 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*7ZDcZBaKK264cIAh 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*7ZDcZBaKK264cIAh 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*7ZDcZBaKK264cIAh 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" type=\"image\/webp\"\/><source data-testid=\"og\" srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*7ZDcZBaKK264cIAh 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*7ZDcZBaKK264cIAh 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*7ZDcZBaKK264cIAh 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*7ZDcZBaKK264cIAh 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*7ZDcZBaKK264cIAh 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*7ZDcZBaKK264cIAh 1100w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*7ZDcZBaKK264cIAh 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\"\/><img alt=\"\" class=\"bf mi mj c\" width=\"700\" height=\"156\" loading=\"lazy\" role=\"presentation\"\/><\/picture><\/div>\n<\/div><figcaption class=\"mk ml gi gg gh mm mn bd b be z dk\">EndAnnotationOperation<\/figcaption><\/figure>\n<h2 id=\"9380\" class=\"no kg ip bd kh np nq dn kl nr ns dp kp lg nt nu kr lk nv nw kt lo nx ny kv nz bi\"><strong class=\"ak\">Search API<\/strong><\/h2>\n<p id=\"84fb\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">This is the key part for our readers. When a client calls our search API we must exclude<\/p>\n<ul class=\"\">\n<li id=\"30a2\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls mt mu mv mw bi\">any annotations which are from isAnnotationOperationActive=FALSE operations or<\/li>\n<li id=\"6128\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls mt mu mv mw bi\">for which Annotation operations are currently in STARTED state. 
We do that by excluding the following from all queries in our system.<\/li>\n<\/ul>\n<p id=\"08e9\" class=\"pw-post-body-paragraph kx ky ip kz b la lt jq lc ld lu jt lf lg lv li lj lk lw lm ln lo lx lq lr ls ii bi\">To achieve the above:<\/p>\n<ol class=\"\">\n<li id=\"0478\" class=\"mo mp ip kz b la lt ld lu lg mq lk mr lo ms ls nn mu mv mw bi\">We add a filter in our ES query to exclude annotations for which isAnnotationOperationActive is FALSE.<\/li>\n<li id=\"4b0f\" class=\"mo mp ip kz b la mx ld my lg mz lk na lo nb ls nn mu mv mw bi\">We query EVCache to find all operations which are in STARTED state. Then we exclude all annotations whose annotationOperationId is found in EVCache. Using EVCache allows us to keep latencies for our search low (most of our queries take less than 100ms).<\/li>\n<\/ol>\n<p id=\"ff54\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">Cassandra is our source of truth, so if an error happens there we fail the client call. However, once we commit to Cassandra we must handle Elasticsearch errors. In our experience, all errors have happened when the Elasticsearch database is having some issue. For that case, we created retry logic for updateByQuery calls to ElasticSearch. If the call fails we push a message to SQS so we can retry in an automated fashion after some interval.<\/p>\n<p id=\"ba79\" class=\"pw-post-body-paragraph kx ky ip kz b la lb jq lc ld le jt lf lg lh li lj lk ll lm ln lo lp lq lr ls ii bi\">In the near term, we want to write a high-level abstraction: a single API which can be called by our clients instead of calling three APIs. 
For example, they can store the annotations in a blob storage like S3 and give us a link to the file as part of the single API.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>At Netflix, many Media Algorithm teams work hand in hand with content creators and editors to promote and recommend content to users in the best possible way. Several of these algorithms aim to improve different manual workflows so that we show the personalized promotional image, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":78948,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37],"tags":[],"class_list":{"0":"post-78946","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-netflix"},"_links":{"self":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts\/78946","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/comments?post=78946"}],"version-history":[{"count":0,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/posts\/78946\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/media\/78948"}],"wp:attachment":[{"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/media?parent=78946"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/categories?post=78946"},{"taxonomy":"post_tag","em
beddable":true,"href":"https:\/\/showbizztoday.com\/index.php\/wp-json\/wp\/v2\/tags?post=78946"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}