BLOG

How to get started with NPL?

NPLANG.org hosts an open community around the Network Programming Language (NPL). The NPL specification can be downloaded from the site, which also points to the GitHub page where the NPL compiler suite and a few example applications are posted.

The nplang GitHub page has three major repositories:

NPL Tutorials
NPL Example applications
NPL Specification

The NPL Tutorials repository has details on how to set up your own working environment. To accelerate users' hands-on experience, we have prepared a complete test environment in a virtual machine. Instructions on how to download the virtual appliance file, and the VirtualBox software that hosts it, are shared on this GitHub page.

This repository also contains NPL Tidbits.

NPL Tidbits are designed to be simple yet comprehensive, so that users get hands-on experience with NPL constructs, the fundamental components of the NPL language. Each example in NPL Tidbits focuses on a specific construct.

The second repository on the nplang GitHub page is NPL example applications, where we have provided a couple of NPL example applications that demonstrate Layer-2 and Layer-3 functionality on an Ethernet switch using NPL language constructs. The examples help users understand how an NPL program can be written to parse and forward packets based on match-action resolutions. The test environment discussed earlier helps users get hands-on experience with the compilation and verification workflow using the NPL Language Compiler and Behavior Model.

Now let's see how to set up a test environment using the links provided on the NPL Tutorials GitHub page.

Step – 1: Download VirtualBox

Step – 2: Download the virtual appliance file

Step – 3: Import the virtual appliance file using VirtualBox Manager and power on the appliance.

Step – 4: Enter the password “npl”, all in lowercase

Let’s take a quick look at directory structure in the test environment

Both NPL tidbits and sample examples are placed in examples under the npl directory at this path. Let’s take a look at the bus construct example and its folder structure.

Each example consists of four items:

npl folder, which holds the main NPL program
bm_tests folder, which holds the packets used for testing and the special functions needed by the NPL program
Makefile for compiling the NPL program
README

Let’s look at the README file

This file contains

A brief description of the example
Contents of the example, explaining the location of the main NPL program and its helper NPL code, and where in the bm_tests directory to find the test packets and table configurations used to verify the NPL program
Detailed steps on how to compile and run the program
What to expect when you verify the NPL program using packets

Let us take the Strength Resolution Construct example to compile and test using the working environment.

In an NPL program, multiple tables may be looked up in parallel. When multiple tables assign the same object as an action, a mechanism is needed to resolve the conflict and derive the final value to use. NPL uses a strength-based mechanism for such resolution when a numerical comparison is sufficient to decide the winning object.

Let's take a look at the NPL example provided in this working environment to understand how strength resolution works. Every example we have written includes a header that explains its purpose. We have also added a topology diagram along with the verification methods used to verify the NPL program.

This example illustrates a strength-based mechanism for deciding the winning cos value when multiple tables can assign it. In this NPL program there are three sources for the cos assignment:

‘local_bus’ carries the default cos assignment
‘pri_cos_mapping’ logical table assigns a cos value based on the 802.1p value (vlan_priority)
‘dscp_cos_mapping’ logical table assigns a cos value based on the dscp value in the IP header

local_bus provides a fixed strength_index, which serves as the default. For the two tables, the strength_index comes from the lookup results of pri_cos_mapping_table and dscp_cos_mapping_table. Each entry in these tables can be given a different strength_index dynamically as a runtime configuration.

The strength_resolve() construct resolves the winning object and outputs the final cos value to local_bus.
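To make the idea concrete, here is a minimal Python sketch of strength-based resolution. It is not NPL and not the behavior model; the `Candidate` class, the `resolve_cos` function, and the sample strength and cos numbers are hypothetical names and values chosen for illustration. It also assumes that a larger strength wins and that a tie keeps the earlier source; the NPL specification is the authority on the actual semantics.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Candidate:
    """One possible source of the cos value (hypothetical model)."""
    name: str      # label used only for reporting
    valid: bool    # True if the source produced a result (e.g. a table hit)
    strength: int  # assumption: the larger number wins
    cos: int       # cos value this source wants to assign


def resolve_cos(default: Candidate,
                table_results: Iterable[Candidate]) -> Tuple[str, int]:
    """Pick the winning cos value among the bus default and table results."""
    winner = default
    for cand in table_results:
        # Assumption: only a strictly stronger, valid result replaces
        # the current winner, so ties keep the earlier source.
        if cand.valid and cand.strength > winner.strength:
            winner = cand
    return winner.name, winner.cos


# Illustrative values only: the bus default is weakest, dscp is strongest.
local_bus = Candidate("local_bus", True, strength=1, cos=0)
pri = Candidate("pri_cos_mapping", True, strength=3, cos=5)
dscp = Candidate("dscp_cos_mapping", True, strength=6, cos=2)

print(resolve_cos(local_bus, [pri, dscp]))  # ('dscp_cos_mapping', 2)
```

Changing the `strength` of a table entry changes the winner without touching the resolution logic, which mirrors the point above that strengths are runtime configuration.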

Verification involves four key steps:

Compile the NPL program using NLC (i.e., the NPL Language Compiler) and bmgen (i.e., the behavior model generator).
Run the behavior model generated from the NPL program.
Populate the logical tables of the behavior model using BMCLI. BMCLI is an interface for populating tables at runtime.
Craft and send packets to the behavior model. For processed packets received from the behavior model, verify forwarding and packet modifications against the NPL program.
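For the last step, it helps to know where the two fields this example cares about sit in a packet. The sketch below is not the test script shipped with the example; it is a standalone, standard-library-only illustration of building an 802.1Q-tagged IPv4 frame with a chosen 802.1p priority and DSCP value. The `build_frame` and `ipv4_checksum` names, the MAC and IP addresses, and the use of protocol number 253 (reserved for experimentation) are all assumptions made for this sketch. The frame carries no payload and is not padded to the Ethernet minimum size.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """Standard one's-complement checksum over a 20-byte IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frame(pcp: int, vid: int, dscp: int) -> bytes:
    """Build an Ethernet + 802.1Q + IPv4 header with the given priority fields."""
    dst = bytes.fromhex("000000000002")  # hypothetical destination MAC
    src = bytes.fromhex("000000000001")  # hypothetical source MAC
    # 802.1Q TCI: 3-bit priority, 1-bit DEI (left at 0), 12-bit VLAN ID.
    tci = ((pcp & 0x7) << 13) | (vid & 0x0FFF)
    eth = dst + src + struct.pack("!HHH", 0x8100, tci, 0x0800)
    # DSCP is the upper 6 bits of the IPv4 TOS byte; ECN bits left at 0.
    tos = (dscp & 0x3F) << 2
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x45, tos, 20,   # version/IHL, TOS, total length
                     0, 0,            # identification, flags/fragment offset
                     64, 253, 0,      # TTL, protocol, checksum placeholder
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    csum = ipv4_checksum(ip)
    ip = ip[:10] + struct.pack("!H", csum) + ip[12:]
    return eth + ip


frame = build_frame(pcp=5, vid=100, dscp=46)
print(frame.hex())
```

With a frame like this, the priority in the VLAN tag is what a table such as pri_cos_mapping would key on, and the DSCP bits are what dscp_cos_mapping would key on.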

Let's go through the step-by-step process to compile and execute an example program.

1. Get into a root bash shell; the root password is “npl”, all in lowercase <type “sudo bash” and “npl”>
2. Go to ncsc-1.3.3rc4 <type “cd ~/ncsc-1.3.3rc4”>
3. Source the setup script <type “source ./bin/setup.sh”>
4. Export environment variables as needed. These are documented on GitHub and also in the README files
<type “export NPL_EXAMPLES=/home/npl/ncsc-1.3.3rc4/examples/npl_tidbits/constructs”>
<type “cd $NPL_EXAMPLES/strength/strength_cos”>
5. Compile the strength_cos NPL program using the front-end compiler. This step helps catch any syntax and semantic issues in the NPL program <type “make fe_nplsim”>
6. Build a behavior model to verify the NPL program compiled in the previous step <type “make nplsim_comp”>
7. Bring up the NPLSIM environment to populate tables and send packets <type “make nplsim_run”>
a. This brings up two xterm windows: one named BMODEL (the console log for the behavior model) and another named BMCLI.
8. At this point, we can populate the logical tables with runtime match-action entries from the BMCLI window. As discussed before, all examples come with pre-crafted example table entries.
Users can change them as required. <type “rcload /home/npl/ncsc-1.3.3rc4/examples/npl_tidbits/constructs/strength/strength_cos/bm_tests/corp_net/tbl_cfg_cos.txt” in BMCLI>
9. Inject the pre-crafted packet from the original console window where we invoked the NPLSIM model. <type “python bm_tests/corp_net/testPkt.py”>
10. We will see packets transmitted and received on the same console.
11. Also, in the BMODEL xterm window, you can see detailed logs of the table hits and the strength resolution, based on the packets received and the tables populated.

Introduction to Network Programming Language (NPL)

Venkat Pullela

Network Programming Language (NPL) is designed to specify packet processing pipelines that can be efficiently implemented on different architectures. Apart from basic constructs such as parsing, tables, and rewrite, NPL supports advanced constructs such as multi-lookup logical tables, functions, and strength. This is an introduction to the NPL language and a few of its important constructs.

A program specifies the switching pipeline control flow using all the NPL constructs. The program() construct provides sequential execution semantics for mapping the constructs onto the underlying pipeline. Target-specific backend compilers may reorder and/or parallelize various blocks, without changing the semantics of the program, to use the underlying resources efficiently.

program l3_app() {
     parse_begin(start);
     port_table.lookup(0);
     parse_continue(ethernet);
     do_vid_assign();
     if (ing_pkt.l2_grp.l2._PRESENT) {
          mac_table.lookup(0);
          mac_table.lookup(1);
     }
     if (cmd_bus.l3_enable) {
          do_l3_forwarding();
          do_packet_edits();
          do_checksum_update();
     }
}

parse_begin() and parse_continue() allow multi-stage, re-entrant parsing. A program() combines table lookups with functions to build an efficient switching pipeline.

Standard Data Types in NPL are bit, bit array and varbit. Bit is used to denote a single bit, and bit array is used to declare fields of fixed length. Variable length fields are declared with the maximum possible field width.

bit cfi; // single bit
bit[12] vid; // 12 bit vid
varbit[64] options; // up to 64b wide

Numbers can be specified in decimal and hex. A non-zero value indicates true and a zero indicates false. All the typical arithmetic and Boolean operators are supported to form expressions.

a = 5;
ipv4.ttl = 0xF;
ipv6.dip = 0x01234567;
if (ipv4.protocol == 0x23)

The user-defined type struct is used to define aggregate data types. It is used to aggregate fields into a header, to form a header group from headers, to define a packet consisting of header groups, and to define a bus consisting of fields.

struct vlan_t { // Header definition
     fields {
          bit[16] tci;
          bit[16] ethertype;
     }
     overlays {
          pcp : tci[15:13];
          dei : tci[12:12];
          vid : tci[11:0];
          full_tag : tci <> ethertype;
     }
}
struct l2_group_t { // Header Group
     fields {
          l2_t l2;
          vlan_t vlan;
     }
}
struct l3_packet_t { // Packet
     fields {
          l2_group_t l2_grp;
          l3_group_t l3_grp;
     }
}
packet l3_packet_t ingress_pkt;
bus cmd_bus_t cmd_bus; // Global Bus

Overlays and concatenation (the <> operator) help power users use the bus and other resources efficiently.
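As a rough illustration of what the overlays in vlan_t describe, the Python sketch below extracts the same bit ranges from a 16-bit tci value and forms the concatenated tag. This models only the bit arithmetic, not NPL itself. The `bits` and `vlan_overlays` helpers are hypothetical, and the sketch assumes that in `tci <> ethertype` the left operand occupies the more significant bits.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Return the bit range value[hi:lo], both ends inclusive."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def vlan_overlays(tci: int, ethertype: int) -> dict:
    """Compute the views that the vlan_t overlays name over tci/ethertype."""
    return {
        "pcp": bits(tci, 15, 13),  # pcp : tci[15:13]
        "dei": bits(tci, 12, 12),  # dei : tci[12:12]
        "vid": bits(tci, 11, 0),   # vid : tci[11:0]
        # full_tag : tci <> ethertype
        # (assumption: tci lands in the upper 16 bits)
        "full_tag": (tci << 16) | ethertype,
    }


fields = vlan_overlays(tci=0xA064, ethertype=0x0800)
print(fields["pcp"], fields["dei"], fields["vid"], hex(fields["full_tag"]))
# 5 0 100 0xa0640800
```

The overlays add no storage: pcp, dei, and vid are just named windows onto the same 16 bits, which is why they help with efficient use of bus resources.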

A bus is used to connect different components in a pipeline. Explicitly supporting a logical bus construct helps pipeline designers think in their own language, especially since NPL is a domain-specific language.

A logical table allows specifying the user view of the underlying physical tables. Actions are separated from logical tables. This makes the tables lightweight and improves throughput. It also moves switching logic out of the table scope into the global scope, which helps with common design patterns such as concurrent features. A logical table provides a disaggregated view of the match-action tables: it is split into lookup, result, processing, and actions to reduce dependencies and improve the scope for parallelization. Smaller primitives also make the programming paradigm more natural and intuitive than force-fitting everything into a single universal primitive.

Table attributes such as size and table type may be specified so that appropriate memory, such as TCAM or SRAM, is allocated to the table from common memory.

A logical table supports multiple lookups of the same table. The key_construct() method can be used to specify a custom key for each of the lookups. Specifying conditionals allows different lookups and their results to be handled differently.

key_construct() {
     if (_LOOKUP0==1) {
          mac = ing_pkt.l2_grp.l2.da;
     }
     if (_LOOKUP1==1) {
          mac = ing_pkt.l2_grp.l2.sa;
     }
} 
fields_assign() {
     if (_LOOKUP0==1) { //e.g. Entry 100
          obj_bus.dst = port;
          obj_bus.dst_discard = d_discard;
     }
     if (_LOOKUP1==1) { //e.g. Entry 200
          temp_bus.src_port = port;
          obj_bus.src_discard = s_discard; 
     }
}

The fields_assign() method is used to specify how the results are processed.
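The two methods above can be read as "build a key per lookup, then route the result per lookup". The Python sketch below models that flow for a MAC table that is looked up once with the destination address and once with the source address. It is a simplified illustration, not the behavior model: the dictionary-based table, the `mac_table_lookup` helper, and the sample entries are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical entry layout: (port, d_discard, s_discard)
MacEntry = Tuple[int, int, int]


def key_construct(pkt: Dict[str, str], lookup: int) -> str:
    """Lookup 0 keys on the destination MAC, lookup 1 on the source MAC."""
    return pkt["da"] if lookup == 0 else pkt["sa"]


def fields_assign(entry: MacEntry, lookup: int,
                  obj_bus: dict, temp_bus: dict) -> None:
    """Route the result fields to different bus fields per lookup."""
    port, d_discard, s_discard = entry
    if lookup == 0:
        obj_bus["dst"] = port
        obj_bus["dst_discard"] = d_discard
    else:
        temp_bus["src_port"] = port
        obj_bus["src_discard"] = s_discard


def mac_table_lookup(table: Dict[str, MacEntry], pkt: Dict[str, str],
                     lookup: int, obj_bus: dict, temp_bus: dict) -> bool:
    """Perform one lookup; return True on a hit."""
    entry = table.get(key_construct(pkt, lookup))
    if entry is None:
        return False
    fields_assign(entry, lookup, obj_bus, temp_bus)
    return True


# Illustrative table contents and packet.
mac_table = {
    "00:00:00:00:00:02": (7, 0, 0),
    "00:00:00:00:00:01": (3, 0, 1),
}
pkt = {"da": "00:00:00:00:00:02", "sa": "00:00:00:00:00:01"}
obj_bus, temp_bus = {}, {}
for lookup in (0, 1):
    mac_table_lookup(mac_table, pkt, lookup, obj_bus, temp_bus)
print(obj_bus, temp_bus)
```

In this model the two lookups run one after the other, whereas the point of the NPL construct is that the target can perform them as two lookups of the same table within the pipeline.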

The function primitive is used to specify switching logic. Packet processing involves not only data processing of the packet fields and metadata; it is also a decision-making process. The function primitive is used to implement the decision trees. Functions are also used to implement logical and arithmetic operations.

function do_l3_forwarding() {
  local_var.no_l3_switch = 0;
  cmd_bus.l3_routable = 0;
  if (cmd_bus.do_l3_lookup &
     cmd_bus.my_stn_routing_enable) {
     if ((ingress_pkt.ipv4.ttl == 0) &&
          (obj_bus.local_address == 0)) {
        local_var.no_l3_switch = 1;
     }
     if ((ingress_pkt.ipv4.ttl == 1) &&
          (obj_bus.local_address == 0)) {
        local_var.no_l3_switch = 1;
     }
     if (ingress_pkt.ipv4.option != 0) {
        local_var.no_l3_switch = 1;
     }
     if (local_var.no_l3_switch == 0) {
        // Output to global bus 
        cmd_bus.l3_routable = 1;
     }
  }
}
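One way to reason about a function like this is to treat it as a pure decision tree and try a few inputs. The Python sketch below is a hand translation of do_l3_forwarding() for that purpose only. It assumes the bus and packet fields can be passed in as plain integers and that the two enable flags are single-bit values (0 or 1); the argument names simply mirror the field names in the NPL code.

```python
def do_l3_forwarding(do_l3_lookup: int, my_stn_routing_enable: int,
                     ttl: int, option: int, local_address: int) -> int:
    """Return the value the NPL function would write to cmd_bus.l3_routable."""
    no_l3_switch = 0
    l3_routable = 0
    # Both enables must be set (single-bit flags assumed, so & acts as AND).
    if do_l3_lookup & my_stn_routing_enable:
        # TTL of 0 or 1 cannot be routed unless the packet is for us.
        if ttl in (0, 1) and local_address == 0:
            no_l3_switch = 1
        # Packets carrying IPv4 options are not L3-switched.
        if option != 0:
            no_l3_switch = 1
        if no_l3_switch == 0:
            l3_routable = 1
    return l3_routable


print(do_l3_forwarding(1, 1, ttl=64, option=0, local_address=0))  # 1
print(do_l3_forwarding(1, 1, ttl=1, option=0, local_address=0))   # 0
```

Writing the logic out this way makes it easy to see that l3_routable is set only when every disqualifying condition is absent.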

The strength construct is used for prioritization and to resolve object dependencies in the decision-making process. It allows multiple tables to be looked up in parallel and the result objects to be resolved based on priority. Profile tables are used to change the priority of objects at runtime.

Strength resolution allows building a decision tree in multiple steps across the pipeline by combining bus objects with table results as well as carrying the result objects over the bus.

strength_resolve(
  local_bus.cos, // output field
  local_bus.cos_strength, // bus strength
  { pri_tbl._LOOKUP0, NULL, // table 1
     profile_tbl.cos_strength,
     pri_tbl.cos},
  { dscp_tbl._LOOKUP0, NULL, // table 2
     profile_tbl.cos_strength,
     dscp_tbl.cos});

Editor constructs are used to modify the packet. To make sure the modifications produce a valid packet, editor constructs allow only headers from the parse tree to be used.

add_header(egr_pkt.group2.tunnel_l2);
delete_header(egr_pkt.group1.otag);
delete_header(egr_pkt.group1);
replace_header_field(egr_pkt.ipv4.dscp, 
               obj_bus.new_dscp);

Special functions are used for optimal implementation of common pipeline primitives. The functionality of these blocks is programmable. NPL supports flexible selection of the inputs and outputs of special functions. Special functions also support a usage mode that selects the inputs and outputs at runtime instead of at compile time.

Network Programming Language (NPL) provides a rich set of constructs to specify and implement efficient packet processing pipelines. NPL also defines specific constructs that support parallel processing, so that many features can be supported concurrently.

The NPL language is completely open and more information can be found at https://nplang.org. The NPL front-end tool chain is available on GitHub.

Why NPL?

Network Programming Language (NPL)

Mohan Kalkunte

With the decoupling of the control plane from the data plane, it is now possible to configure network elements from an SDN controller. OpenFlow was the first attempt to describe the data plane of a switch using table semantics. In the last few years, there has been an effort to define data plane functionality in a high-level language. One such language is P4, an open source language that continues to evolve.

At Broadcom, network switches have evolved from fixed-function to highly configurable devices. These configurable devices provide a significant number of features with a highly efficient pipeline architecture. Broadcom’s Trident family of switches provides a feature-rich set of capabilities for enterprise and campus data centers. The Trident 3 series is semi-programmable, with a tool chain that is factory enabled. This allows Broadcom to introduce new features with different flex code without incurring the long ASIC development cycle.

The DNX line of devices has supported programmability since the Petra devices in 2010. With every generation of the DNX packet processor, the programmability of the pipeline has improved, with the Jericho2 device being fully programmable.

Following Trident 3, the next step in the XGS family was to build a highly efficient and programmable pipeline for data plane and instrumentation, which led to the recently announced Trident 4. Existing languages did not provide the rich constructs needed to express the XGS and DNX architectures. Expressing the intent of the data plane for efficient packet processing led us to develop a new programming language called NPL (Network Programming Language) that can efficiently address multiple silicon architectures for switching, routing, and appliances.

The key goal of NPL was to have a rich set of constructs that enable a highly efficient pipeline architecture. By this we mean that the cost of providing programmability over a fixed-function design should be minimal for the same feature set and table scale. In addition, the need to define language constructs at a faster pace, without being tied to an external standards organization, was also a key concern.

NPL at a high level provides the following capabilities: 1) Basic Constructs 2) Advanced Constructs 3) Special functions 4) Instrumentation and 5) Runtime programmability.

Decoupling the action from the table lookup and using functions to express complex logic makes for an efficient silicon implementation. Key advanced constructs, such as multiple table lookups executed in parallel, are important for delivering a low-latency pipeline with high feature capacity. The ability to resolve multiple table lookup decisions based on a programmable strength gives the user the freedom to express the capability flexibly. Special functions are like engines that are hardware-optimized for specialized tasks such as hashing and LAG resolution. The inputs to these functions can be defined in NPL by the user, and the outputs can be flexibly interpreted by the user and expressed in NPL. Runtime programmability is especially important for instrumentation, because it allows objects of interest to be attached to counters dynamically at runtime.

The NPL language is completely open and more information can be found at https://nplang.org. The NPL front-end tool chain is available on GitHub. Currently, NPL is supported on Trident 4 and Jericho 2 devices.