dash's People

Contributors

andriy-kokhan, aputriax, ashutosh-agrawal, byreal, chrispsommers, desaimg1, dgalan-xxia, jafingerhut, kcudnik, krisney-msft, lguohan, marian-pritsak, mariobaldi, mgheorghe, mhanif, microsoftopensource, mmiele, mukeshmv, murthyvijay, oleksandrivantsiv, prsunny, pterosaur, r12f, reshmaintel, sanjayth, taras-keryk-plv, vijasrin, vincent-xs, vmytnykx, yusefms06

dash's Issues

GUIDs used in DASH configuration example

https://github.com/Azure/DASH/blob/main/documentation/gnmi/design/dash-reference-config-example.md

I see in this document many entries like:
"253de6f9-37bd-40ce-9cb2-9715915941d3": {
"qos-id": "253de6f9-37bd-40ce-9cb2-9715915941d3",

In my opinion, the sample config in the documentation should use descriptive, human-readable IDs to make it easier to read and understand.

Also, any further user-facing APIs should preserve the names from the config: if I call a rule "dash-qos-rule-77", then a config get should return it as "dash-qos-rule-77", and showing detailed info on qos --rule-id "dash-qos-rule-77" should work.

Internally, the application may choose to use such GUIDs, but user-facing configuration should have human-readable IDs in the form of names: what you set is what you get.
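For illustration, a purely hypothetical rewrite of the fragment above using a descriptive name (the key "dash-qos-rule-77" is invented for this example):

"dash-qos-rule-77": {
"qos-id": "dash-qos-rule-77",

With such names, the config is self-documenting and round-trips cleanly through get/show commands.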

IPv6 IP Options extension headers

In bmv2 we currently do not support (cannot parse/process) extension headers. IPv6 allows multiple chained headers; in v4 and v6 we handle only the vanilla headers. This issue tracks supporting extension headers in bmv2.
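For context, a minimal Python sketch of what walking the extension-header chain involves (illustrative only; the simplified length handling below holds for hop-by-hop, routing and destination options, but real parsers must special-case AH and other types):

# IPv6 extension header protocol numbers: hop-by-hop (0), routing (43),
# fragment (44), destination options (60).
EXT_HEADERS = {0, 43, 44, 60}

def upper_layer_proto(next_header: int, payload: bytes) -> int:
    # Walk the chain until a non-extension header (e.g. 6 = TCP, 17 = UDP).
    off = 0
    while next_header in EXT_HEADERS:
        nh, hdr_ext_len = payload[off], payload[off + 1]
        # Fragment header is fixed at 8 bytes; others are (hdr_ext_len + 1) * 8.
        off += 8 if next_header == 44 else (hdr_ext_len + 1) * 8
        next_header = nh
    return next_header

A P4 parser has to express the same walk as explicit parser states, which is why extension-header support is non-trivial in bmv2.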

Mention of Azure in Glossary

I noticed that the Overlay definition (in the glossary) references Azure customers when discussing the APIs. DASH is not owned by Microsoft or Azure and will remain a public and open source of information. We should remove all references to Azure and use other words in their place, like customer SDN, SDN orchestrator, and so on.

DASH will be used in the enterprise and by other clouds, which will orchestrate the services that fit their own individual needs. Over time, we all hope that DASH will provide high-level guidance to the lower-level implementations of smart devices, getting the most out of them without overly complicating scenarios with infinite flexibility that leads to sub-optimal performance.

UDP Support; waiting on test infrastructure

Related information

Scope - P4 DASH pipeline. Requires definition of the UDP header in the overlay pipeline headers, proper parsing, and matching on L4 ports alongside TCP ports.

Notes

  • Needs to be coded in P4 behavioral model
  • No change in bmv2, just P4

DASH and SONiC parameter naming sync

In my opinion, there should be no difference between SONiC and SONiC-DASH for attributes and values that serve the same purpose/functionality. (DASH may have extra fields, which is fine, but common fields should be named the same.)

If SONiC calls it "packet action", why does DASH call it "action"?
If SONiC uses FORWARD, DROP, MIRROR, and so on, can SONiC-DASH use the same values instead of allow/deny?

The goal should be to be able to copy-paste a config and have it work (provided it uses parameters supported by both projects).

SONiC defines the ACL schema as:
https://github.com/Azure/SONiC/blob/master/doc/acl/ACL-High-Level-Design.md
"ACL_RULE_TABLE:0d41db739a2cc107:3f8a10ff": {
"priority" : "55",
"IP_PROTOCOL" : "TCP",
"SRC_IP" : "20.0.0.0/25",
"DST_IP" : "20.0.0.0/23",
"L4_SRC_PORT_RANGE: "1024-65535",
"L4_DST_PORT_RANGE: "80-89",
"PACKET_ACTION" : "FORWARD"
},

SONiC-DASH defines the ACL schema as:
https://github.com/Azure/DASH/blob/main/documentation/general/design/dash-sonic-hld.md
DASH_ACL_RULE:{{group_id}}|{{rule_num}}
key = DASH_ACL_RULE:group_id:rule_num ; unique rule num within the group.
; field = value
priority = INT32 value ; priority of the rule, lower the value, higher the priority
action = allow/deny
terminating = true/false ; if true, stop processing further rules
protocols = list of INT ',' separated; E.g. 6-tcp, 17-udp
src_addr = list of source ip prefixes ',' separated
dst_addr = list of destination ip prefixes ',' separated
src_port = list of range of source ports ',' separated
dst_port = list of range of destination ports ',' separated
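For illustration only, here is a hypothetical DASH rule spelled with the SONiC names and values this issue argues for (the field names are invented to make the point, not an agreed schema):

"DASH_ACL_RULE:group1:rule1": {
"PRIORITY" : "55",
"IP_PROTOCOL" : "6",
"SRC_IP" : "20.0.0.0/25",
"PACKET_ACTION" : "FORWARD"
}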

NVGRE or VxLAN or both

The hero test talks about VXLAN as the tunneling protocol:
https://github.com/Azure/DASH/blob/main/documentation/general/requirements/program-scale-testing-requirements-draft.md

Some design documents show NVGRE as the tunneling protocol:
https://github.com/Azure/DASH/blob/main/documentation/general/design/images/service_tunneling.png

What will be supported by DASH: NVGRE, VXLAN, or both?
Are any other tunneling protocols to be supported?

And to what extent: will VXLAN and NVGRE be fully supported per their specs?
https://datatracker.ietf.org/doc/html/rfc7348
https://datatracker.ietf.org/doc/html/rfc7637
....

Update HA HLD doc per latest design

Per HA WG meeting, this spec needs updating:
https://github.com/Azure/DASH/blob/main/documentation/high-avail/design/high-availability-and-scale.md

For example, instead of using longer AS paths (prepending) to establish preferred routes, the intent is to use load balancing (e.g. hash-based) from the ToR to the DPUs to balance active-active traffic. The default mode of operation is active-active.

The entire doc should be reviewed and refreshed to replace preliminary or obsolete concepts with the latest thinking from the SDN architects. In addition, some more details were explained by @mzms and we should attempt to capture these in writing. Michal proposed we create a "v2" version of the spec.

DASH SAI objects getting appended multiple times to header file

DASH SAI generator: every time "make sai" is run, the generator appends duplicate objects to the _sai_object_key_entry_t union.

:~/DASH/dash-pipeline$ grep "sai_direction_lookup_entry_t" SAI/SAI/inc/saiobject.h
    sai_direction_lookup_entry_t direction_lookup_entry;
    sai_direction_lookup_entry_t direction_lookup_entry;
    sai_direction_lookup_entry_t direction_lookup_entry;

Subsequent build of vnet_out fails:

In file included from /SAI/SAI/inc/sai.h:48,
                 from vnet_out.cpp:5:
/SAI/SAI/inc/saiobject.h:122:34: error: redeclaration of 'sai_direction_lookup_entry_t _sai_object_key_entry_t::direction_lookup_entry'
  122 |     sai_direction_lookup_entry_t direction_lookup_entry;
      |                                  ^~~~~~~~~~~~~~~~~~~~~~
/SAI/SAI/inc/saiobject.h:95:34: note: previous declaration 'sai_direction_lookup_entry_t _sai_object_key_entry_t::direction_lookup_entry'
   95 |     sai_direction_lookup_entry_t direction_lookup_entry;
      |                                  ^~~~~~~~~~~~~~~~~~~~~~
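A minimal sketch of the kind of guard that could make the generator idempotent (assuming sai_api_gen.py builds saiobject.h by appending declaration lines; the helper below is illustrative, not the generator's actual code):

def add_union_members(header_lines, members):
    # Idempotently append union member declarations.
    # members: iterable of (type_name, field_name) tuples, e.g.
    # ("sai_direction_lookup_entry_t", "direction_lookup_entry").
    # Re-running the generator then cannot duplicate members.
    existing = {line.strip() for line in header_lines}
    for type_name, field_name in members:
        decl = f"{type_name} {field_name};"
        if decl not in existing:
            header_lines.append("    " + decl)
    return header_lines

Alternatively, regenerating the header from scratch on every run (rather than editing it in place) avoids the problem entirely.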

Fix routing action in vnet-vnet document

In this section of the doc
https://github.com/mmiele/DASH/blob/cb3d87ddf6247093c67f0ca6970c000d4db3857c/documentation/vnet2vnet-service/design/vnet-to-vnet-service.md#routing-a-packet-to-address-10101

1. Perform LPM lookup.
2. Select routing table DASH_ROUTE_TABLE:10.1.0.0/24. The action type is vnet, the value is Vnet1, and overlay_ip=10.0.0.6.
3. Look up DASH_ROUTING_TYPE:vnet. The value for vnet is maprouting.

Points 2 and 3 above should have action type / routing type as vnet_direct instead of vnet.

Support for list-match type (P4) for ACLs, pending BMV2 fork

Related information

Scope - BMv2. DASH ACL defines two new match types, list and range_list. They need to be handled by the simulator so that it can match a packet field against a list of values (or ranges) and generate a hit if any of them matches.
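The intended semantics, sketched in Python for clarity (illustrative only; the real work is implementing this in the bmv2 match-unit code):

def list_match(field, values):
    # Hit if the field equals any value in the list.
    return field in values

def range_list_match(field, ranges):
    # Hit if the field falls inside any (lo, hi) range, inclusive.
    return any(lo <= field <= hi for lo, hi in ranges)

# e.g. range_list_match(8080, [(80, 89), (1024, 65535)]) -> True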

Dash build bmv2 container permission error

The bmv2 container image used in the DASH build requires write access to volumes mounted from the host machine. However, this Docker image comes with a hard-coded uid/gid assigned in the Dockerfile when the image is built, and the build workflow uses a pre-built bmv2 image pulled from the registry. When a container runs from this image, the build commands inside it run under that uid and try to write to the mounted volumes on the host, resulting in permission errors.

# make sai
Generate SAI library headers and implementation...
docker run -v /home/mukesh/dash/dash-pipeline/bmv2:/bmv2 -v /home/mukesh/dash/dash-pipeline/SAI:/SAI -v /home/mukesh/dash/dash-pipeline/tests:/tests --network=host --rm -it \
        --name build_sai-mukesh \
        -w /SAI chrissommers/dash-bmv2:pr127-220623 \
    ./generate_dash_api.sh
Directory ./lib will be deleted...
Traceback (most recent call last):
  File "./sai_api_gen.py", line 357, in <module>
    shutil.rmtree('./lib')
  File "/usr/lib/python3.8/shutil.py", line 718, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/usr/lib/python3.8/shutil.py", line 675, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/usr/lib/python3.8/shutil.py", line 673, in _rmtree_safe_fd
    os.unlink(entry.name, dir_fd=topfd)
PermissionError: [Errno 13] Permission denied: 'utils.cpp'
Makefile:85: recipe for target 'sai' failed
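One common workaround (untested here, and it assumes the image's build scripts tolerate running as an arbitrary uid) is to pass the host user's uid/gid when starting the container, e.g.:

docker run --user $(id -u):$(id -g) ... chrissommers/dash-bmv2:pr127-220623 ./generate_dash_api.sh

so that files written to the mounted volumes are owned by the invoking host user.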

Application table in P4 pipeline code

Not sure if this is the best place to ask these questions, but here goes.
(This is primarily addressed to Marian)

In the sirius_pipeline.p4 code, "table appliance" is defined. The table's key is an appliance_id. Do we expect that there are multiple appliance entries, and if so, where does the index come from? Or is this table really just a single entry with some global attributes?

One of the attributes is "neighbor_mac", which is used for the outer dmac when encapping the vxlan header stack onto the packet. I wanted to double-check that there is indeed one such mac address (and that putting the attribute here was not just a placeholder).

Thanks,
Bud

What is desired behavior for classifying non-first IPv4/IPv6 fragments?

When IPv4/IPv6 packets are fragmented, the first fragment contains the TCP/UDP L4 ports, and usually also the TCP flags fields [1]. Thus first IP fragments contain within themselves all the information required for performing ACL-like classification.

However, non-first IP fragments never contain the L4 ports. Many "stateless" ACL-like classifiers, i.e. those that don't do IP reassembly in the data plane, typically define explicit, separate, user-configured rules about what should be done with non-first IP fragments. Without a more stateful mechanism in the data plane, that is the best I think one can do.
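For concreteness, a minimal Python sketch (IPv4 only, illustrative) of how a stateless classifier distinguishes the cases:

import struct

def classify_ipv4_fragment(ip_header: bytes) -> str:
    # Bytes 6-7 of the IPv4 header hold the flags and fragment offset.
    flags_frag = struct.unpack("!H", ip_header[6:8])[0]
    more_fragments = bool(flags_frag & 0x2000)   # MF flag
    fragment_offset = flags_frag & 0x1FFF        # in 8-byte units
    if fragment_offset > 0:
        return "non-first-fragment"   # no L4 ports available
    if more_fragments:
        return "first-fragment"       # L4 ports usually present
    return "unfragmented"

Only the "non-first-fragment" case forces the user-configured fragment rules, since the L4 fields simply are not there to match on.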

If a flow cache is created after ACL-like classification, is the expectation that non-first fragments must somehow be associated with the first fragment, match the ACL rules that have L4 ports in them, and be treated the same way as the first fragment was? If yes, that is trickier to implement in low-cost-per-Gbps data planes: it will be impossible in some data planes and possible in others.

If it is acceptable for non-first IP fragments to be handled in a more expensive general purpose CPU core in software, that opens up much more precise fragment handling behavior possibilities.

If these questions do not make the issue clear, please ask in a follow up comment and I can give more details.

[1] Attackers can send only the first 8 bytes of a TCP header in a first fragment if they are trying to bypass matching of ACL-like classification on TCP flags, and most Cisco routers (for example) discard such first IP fragments if they contain only a partial TCP header. For more discussion on that corner case, see https://datatracker.ietf.org/doc/html/rfc1858

[info request] DASH interop with SONiC

In DASH we have static VXLAN configuration, and I assume the DASH appliance connects to SONiC switches.
Is VXLAN terminated on the SONiC switch?

What is the typical network topology with the VMs, servers, SONiC switches, DASH appliances, core router, and external connection to the outside internet?

What is the typical configuration done on each device (protocols, etc.)?

Improve drop handling

  • Need a consistent plan for packet drops.
    When we decide to drop a packet, do we just mark it for drop, or stop processing immediately with a "return"?
    We also need to consider when counters are incremented.

  • The vip table's deny action is logically @defaultonly,
    but making that change would currently break SAI generation due to the generation tool.
    If the SAI generation tool is enhanced, we could make this change.

  • The lookup in table eni_ether_address_map should drop the packet on a lookup miss.
    In general, the code needs more negative checking.

Extract default actions etc. from P4 model and add comments to generated SAI header files

Originally commented in #32 (review):

[SAI/sai_api_gen.py]
        for key in table[KEY_TAG]:
            sai_table_data['keys'].append(get_sai_key_data(program, key))

        for action in table['actions']:

Can you extract the P4 default action and render a suitable comment in the generated header file? For example, ACL default actions are deny, and this should appear in the generated .h. Likewise, if there are other attributes or subtleties buried in or implied by the P4 code, can we try to make them explicit in the generated .h header? If you cannot add this to the current PR, perhaps create an issue for it so we can track it. Thanks.
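A rough sketch of what this could look like in the generator (the JSON field names below are guesses at the P4Info layout and may differ across p4c versions; purely illustrative):

def default_action_comment(program, table):
    # Look up the table's const default action in the P4Info-derived JSON
    # and render it as a comment for the generated SAI header.
    # NOTE: 'constDefaultActionId' is an assumed key, not a confirmed schema.
    action_id = table.get('constDefaultActionId')
    if action_id is None:
        return ''
    for action in program['actions']:
        if action['preamble']['id'] == action_id:
            name = action['preamble']['name']
            return f"/* P4 default action: {name} (e.g. deny for ACL tables) */"
    return ''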

XN Tracking: xn in Simulator

Related information

Owner - NVidia. Scope - BMv2. Requires support for PNA table properties and externs such as idle_timeout_with_auto_delete, add_on_miss, default_action, add_entry, and set_entry_expire_time. This will allow automatic learning and expiry of entries.

IPv6 routing support

Related information

Scope - P4 DASH pipeline. Requires definition of the IPv6 header in the overlay pipeline headers, proper parsing, and matching on L3 addresses alongside IPv4 addresses.

@marian-pritsak to provide further information, then hand off to Hanif.

Notes

  • Needs to be coded in P4 behavioral model
  • No change in bmv2, just P4

Detailed requirements questions on some desired parameter ranges for a DASH device

If these are already documented in the DASH repo, my apologies for the noise in asking, and I would appreciate greatly a pointer to where they are already documented.

What is the maximum configurable timeout interval from seeing the first FIN in a TCP connection until the connection state is deleted?

Is there a maximum rate at which a DASH device must handle received IP fragments? Answers like "must be able to do so at 100% line rate of all received traffic" or "a maximum of 50,000 IP fragments per second; any more than that may be discarded without being processed" are two extreme possibilities, both of which help product designers decide how to handle fragments.

Simulator Development

Related information

Marian to provide guidance; the owner is NVidia.

Will every technology provider supply a Docker container, or will implementations be packaged differently?

To Be Documented: SDN Controller's knowledge of flows and its control of various counters

Based on the discussion in Community meeting today, please help confirm/clarify the following understanding:

  1. The SDN Controller does not need to know about dynamically learned flows. In other words, the SDN Controller won't poll the switch for flows or expect the switch to notify it when a new flow is learned or an existing flow is deleted.
  2. The SDN Controller won't poll the switch for per-flow statistics or expect the switch to stream per-flow statistics to it periodically or when a flow is deleted.
  3. The SDN Controller will explicitly create counters for specific entities. We discussed counters per ENI and per routing entry in today's meeting. I also see counters associated with CA-to-PA mappings and ACLs in the P4 model. Are there any other counters the SDN Controller will create?

Connection Tracking: Too loose, UDP and Aging

Does the model remove the flow entirely on the FIN flag, or does it wait for some time after the FIN so that the final ACKs can be forwarded? Is this too loose? The connection should be active only until an ACK packet in each direction covers the sequence number of the FIN packet in the other direction, or a timeout expires, whichever occurs first. I believe it would take ~7 states per direction to strictly track the TCP state. Should TCP connection tracking also support TCP window tracking?

What about support for UDP "connections" and the behavior of aging of both UDP and TCP flows?

What is the desired behavior when an ACL or route table configuration change potentially affects the forwarding of an established flow? The specifications describe a slow path and a fast path, where the ACL and route lookups are avoided in the fast path (for performance). The current behavioral model appears to execute the ACL and route lookups on each and every packet, so there is no behavioral difference between the slow path and the fast path. This implies that implementations with both a fast and a slow path must re-evaluate ACL/route lookups for flows whenever a configuration change occurs that may affect them. Is this correct? Should the behavioral model explicitly model a fast and a slow path?
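To make the stricter teardown concrete, a minimal Python sketch (illustrative only; it ignores sequence wraparound, RSTs and timeouts):

class TcpTeardownTracker:
    # A flow is considered closed only once each direction's FIN sequence
    # number has been ACKed by the opposite direction.
    def __init__(self):
        self.fin_seq = {0: None, 1: None}     # FIN seq seen per direction
        self.fin_acked = {0: False, 1: False}

    def on_packet(self, direction, seq, ack, fin_flag, ack_flag):
        if fin_flag:
            self.fin_seq[direction] = seq
        other = 1 - direction
        if ack_flag and self.fin_seq[other] is not None \
                and ack > self.fin_seq[other]:
            self.fin_acked[other] = True

    def closed(self):
        return self.fin_acked[0] and self.fin_acked[1]

Full state tracking (the ~7 states per direction mentioned above) would additionally distinguish FIN_WAIT, CLOSING, TIME_WAIT and so on.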

https://github.com/Azure/DASH/blob/6b2a638d6f620469fdff59716c65bb7286b5ef61/sirius-pipeline/sirius_conntrack.p4#L42

See PR PNA compatible connection tracking #21

IPv6 ACL support

  • Needs to be coded in P4 behavioral model
  • No change in bmv2, just P4

Notes

Hi Mohammad, the following tables will require IPv6 support:

  • sirius_outbound.routing
  • sirius_outbound.ca_to_pa
  • ACL stages (sirius_acl file)

All of those tables match on an IPv4 address. We need to support matching on either an IPv4 or an IPv6 address, depending on the packet's overlay ethertype.

Per Mario Baldi (Collaborator): We will also need IPv6 support in connection tracking, hence generally throughout sirius_conntrack.p4. My suggestion would be to first finalize connection tracking with the proper connection-removal behavior just for IPv4 (building on the current version of the code) and then later add IPv6 support.

Detail questions on dash-handling-fragmented-packets.md

These statements are made in the latest version of that file as of 2022-Feb-23:

  • If a subsequent packet arrives that is the start of a fragmented packet, the Frag ID must be used to create a new temporal flow that can be uniquely identified by the (Frag ID, DST, SRC) tuple.

  • If the connection is closed with the arrival of the FIN packet then all temporal flows must be closed as well.

The last statement seems to assume that temporal flows can be associated with a 5-tuple. That is true if the first IP fragment arriving for an original unfragmented packet contains the L4 header.

Is there a preferred behavior if the first IP fragment for a particular (Frag ID, DST, SRC) tuple is a non-first fragment, and contains no L4 header information, and thus cannot be associated with a particular 5-tuple (at least not yet)?

Translation: P4 Runtime layer SAI -> Simulator

Related information

NVidia - this is the 'glue' layer. The simulator exposes the P4Runtime API for configuration. The goal is to align it with the SAI API by providing a translation from SAI to P4Runtime.
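As a sketch of what the glue layer does per call (pure illustration; the IDs come from P4Info, and the real mapping would be generated from the P4 model rather than hand-written):

from p4.v1 import p4runtime_pb2

def sai_create_to_p4rt(table_id, match_fields, action_id, params):
    # Translate one SAI create call into a P4Runtime TableEntry.
    entry = p4runtime_pb2.TableEntry()
    entry.table_id = table_id                # table id from P4Info
    for field_id, value in match_fields:     # value: encoded bytes
        m = entry.match.add()
        m.field_id = field_id
        m.exact.value = value
    entry.action.action.action_id = action_id
    for param_id, value in params:
        p = entry.action.action.params.add()
        p.param_id = param_id
        p.value = value
    return entry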

Add additional HA requirements from https://github.com/Azure/DASH/pull/56/ into high-availability-and-scale.md

#56 contains the following content in the proposal document. If this is indeed the case, these requirements should go into https://github.com/Azure/DASH/blob/main/documentation/high-avail/design/high-availability-and-scale.md

Microsoft has provided some additional requirements:

  1. HA Interoperability is required between vendors
    • Pairing cards from different vendors is not the typical deployment, but must work
  2. The HA packet format and protocol must be public
    • This allows sniffed/mirrored HA messages to be analyzed
    • No vendor-private protocol is allowed
  3. The HA protocol for syncing active flows could have a base mode and optional modes
    • Additional modes could be defined, for example to reduce the PPS/bps needed for the active sync messages
    • A vendor only needs to support the base mode
    • Any optional modes must also be public
  4. The HA protocol does not need to reliably sync 100% of the flows between cards
    • Ideally all flows are synced, but it is OK if a small number of flows (hundreds out of tens of millions) are missed.

Questions on Program Scale Testing Requirements

The "Program Scale Testing Requirements for LAB Validation" document stated that
"b. Download new policies and delete old policies at a significant rate to ensure that CPS, Active Connections, Aging, and new
Policies are properly handled with the external memory, which is often the bottleneck for performance."

Please provide the following additional info:

  1. What will be the "significant rate" that we should expect for the policies being added and deleted?
  2. What are the expected behaviors of the new policies on the existing active connections? For example, if the new policy results in a deny action for the existing connections, should the impacted connections be removed? If yes, is there an expectation on the rate of connection removal?

Specify Inter-DPU HA flow sync communications requirements/restrictions

@mzms @lguohan Please state requirements/expectations for Inter-DPU HA flow sync communications:

  • All communication between cards travels over the datacenter network, i.e. DPU-ToR-DPU.
  • Are there any control-plane interactions, or is it DPU-DPU only?
  • What protocols are allowed (TCP/UDP, unicast/multicast)?
  • What IP address endpoints should be used (a new one; the same as the gNMI management address but a different port; etc.)?
  • What SLA can be guaranteed for the DPU-DPU path (e.g. will a burst of sync updates experience packet drops/throttling)?
  • Can vendors implement their own protocols, or must there be an interoperability standard? Is there room for both?
  • Please articulate continuous updating vs. live migration.

ixia-c tests can fail if protobuf already installed with incompatible version

Use of snappi package 0.7.37, as specified in https://github.com/Azure/DASH/blob/main/test/requirements.txt, can cause snappi client errors if an incompatible version of the protobuf Python package is already installed.

Per ixia-c support Slack channel (https://ixia-c.slack.com/archives/C021DU5026R/p1657894469742489?thread_ts=1657847300.692199&cid=C021DU5026R), this can be fixed by upgrading to snappi 0.7.38. I will file a PR to do so.
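Until that lands, a likely local workaround (assuming, per the Slack thread, that 0.7.38 pulls in a compatible protobuf) is simply:

python3 -m pip install snappi==0.7.38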

An example error output is shown below:

dash@chris-z4:~/chris-dash/DASH/dash-pipeline$ make run-all-tests 
# Ensure P4Runtime server is listening
t=5; \
while [ ${t} -ge 1 ]; do \
	if sudo lsof -i:9559 | grep LISTEN >/dev/null; then \
		break; \
	else \
		sleep 1; \
		t=`expr $t - 1`; \
	fi; \
done; \
docker exec -w /tests/vnet_out simple_switch-dash ./vnet_out
GRPC call SetForwardingPipelineConfig 0.0.0.0:9559 => /etc/dash/dash_pipeline.json, /etc/dash/dash_pipeline_p4rt.txt
GRPC call Write::add_one_entry OK: GRPC call Write::add_one_entry OK: GRPC call Write::add_one_entry OK: Done.
python3 -m pip install -r ../test/requirements.txt
Requirement already satisfied: snappi==0.7.37 in /home/dash/.local/lib/python3.8/site-packages (from -r ../test/requirements.txt (line 1)) (0.7.37)
Requirement already satisfied: pytest==6.0.1 in /home/dash/.local/lib/python3.8/site-packages (from -r ../test/requirements.txt (line 2)) (6.0.1)
Requirement already satisfied: urllib3 in /usr/lib/python3/dist-packages (from snappi==0.7.37->-r ../test/requirements.txt (line 1)) (1.25.8)
Requirement already satisfied: grpcio-tools==1.44.0; python_version > "2.7" in /home/dash/.local/lib/python3.8/site-packages (from snappi==0.7.37->-r ../test/requirements.txt (line 1)) (1.44.0)
Requirement already satisfied: PyYAML in /usr/lib/python3/dist-packages (from snappi==0.7.37->-r ../test/requirements.txt (line 1)) (5.3.1)
Requirement already satisfied: grpcio==1.44.0; python_version > "2.7" in /home/dash/.local/lib/python3.8/site-packages (from snappi==0.7.37->-r ../test/requirements.txt (line 1)) (1.44.0)
Requirement already satisfied: requests in /usr/lib/python3/dist-packages (from snappi==0.7.37->-r ../test/requirements.txt (line 1)) (2.22.0)
Requirement already satisfied: pluggy<1.0,>=0.12 in /home/dash/.local/lib/python3.8/site-packages (from pytest==6.0.1->-r ../test/requirements.txt (line 2)) (0.13.1)
Requirement already satisfied: more-itertools>=4.0.0 in /usr/lib/python3/dist-packages (from pytest==6.0.1->-r ../test/requirements.txt (line 2)) (4.2.0)
Requirement already satisfied: toml in /home/dash/.local/lib/python3.8/site-packages (from pytest==6.0.1->-r ../test/requirements.txt (line 2)) (0.10.2)
Requirement already satisfied: attrs>=17.4.0 in /usr/lib/python3/dist-packages (from pytest==6.0.1->-r ../test/requirements.txt (line 2)) (19.3.0)
Requirement already satisfied: py>=1.8.2 in /home/dash/.local/lib/python3.8/site-packages (from pytest==6.0.1->-r ../test/requirements.txt (line 2)) (1.11.0)
Requirement already satisfied: packaging in /home/dash/.local/lib/python3.8/site-packages (from pytest==6.0.1->-r ../test/requirements.txt (line 2)) (21.3)
Requirement already satisfied: iniconfig in /home/dash/.local/lib/python3.8/site-packages (from pytest==6.0.1->-r ../test/requirements.txt (line 2)) (1.1.1)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from grpcio-tools==1.44.0; python_version > "2.7"->snappi==0.7.37->-r ../test/requirements.txt (line 1)) (45.2.0)
Requirement already satisfied: protobuf<4.0dev,>=3.5.0.post1 in /usr/local/lib/python3.8/dist-packages (from grpcio-tools==1.44.0; python_version > "2.7"->snappi==0.7.37->-r ../test/requirements.txt (line 1)) (3.6.1)
Requirement already satisfied: six>=1.5.2 in /usr/lib/python3/dist-packages (from grpcio==1.44.0; python_version > "2.7"->snappi==0.7.37->-r ../test/requirements.txt (line 1)) (1.14.0)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /home/dash/.local/lib/python3.8/site-packages (from packaging->pytest==6.0.1->-r ../test/requirements.txt (line 2)) (3.0.9)
cd ../test/third-party/traffic_gen && ./deploy_ixiac.sh
.
deployment_traffic_engine_1_1 is up-to-date
deployment_controller_1 is up-to-date
deployment_traffic_engine_2_1 is up-to-date
# Ensure P4Runtime server is listening
t=5; \
while [ ${t} -ge 1 ]; do \
	if sudo lsof -i:9559 | grep LISTEN >/dev/null; then \
		break; \
	else \
		sleep 1; \
		t=`expr $t - 1`; \
	fi; \
done; \
docker exec -w /tests/init_switch simple_switch-dash ./init_switch
GRPC call SetForwardingPipelineConfig 0.0.0.0:9559 => /etc/dash/dash_pipeline.json, /etc/dash/dash_pipeline_p4rt.txt
Switch is initialized.
python3 -m pytest ../test/test-cases/bmv2_model/ -s
================================================================================================== test session starts ===================================================================================================
platform linux -- Python 3.8.10, pytest-6.0.1, py-1.11.0, pluggy-0.13.1
rootdir: /home/dash/chris-dash/DASH
collected 0 items / 1 error                                                                                                                                                                                              

========================================================================================================= ERRORS =========================================================================================================
____________________________________________________________________________ ERROR collecting test/test-cases/bmv2_model/test_hello_world.py _____________________________________________________________________________
../../../.local/lib/python3.8/site-packages/snappi/otg_pb2_grpc.py:7: in <module>
    import otg_pb2 as otg__pb2
E   ModuleNotFoundError: No module named 'otg_pb2'

During handling of the above exception, another exception occurred:
../test/test-cases/bmv2_model/test_hello_world.py:1: in <module>
    import snappi
../../../.local/lib/python3.8/site-packages/snappi/__init__.py:1: in <module>
    from .snappi import Config
../../../.local/lib/python3.8/site-packages/snappi/snappi.py:17: in <module>
    from snappi import otg_pb2_grpc as pb2_grpc
../../../.local/lib/python3.8/site-packages/snappi/otg_pb2_grpc.py:9: in <module>
    from snappi import otg_pb2 as otg__pb2
../../../.local/lib/python3.8/site-packages/snappi/otg_pb2.py:23: in <module>
    _CONFIG = DESCRIPTOR.message_types_by_name['Config']
E   AttributeError: 'NoneType' object has no attribute 'message_types_by_name'
==================================================================================================== warnings summary ====================================================================================================
/usr/local/lib/python3.8/dist-packages/google/protobuf/internal/containers.py:182
  /usr/local/lib/python3.8/dist-packages/google/protobuf/internal/containers.py:182: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
    MutableMapping = collections.MutableMapping

/usr/local/lib/python3.8/dist-packages/google/protobuf/internal/containers.py:340
  /usr/local/lib/python3.8/dist-packages/google/protobuf/internal/containers.py:340: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
    collections.MutableSequence.register(BaseContainer)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
================================================================================================ short test summary info =================================================================================================
ERROR ../test/test-cases/bmv2_model/test_hello_world.py - AttributeError: 'NoneType' object has no attribute 'message_types_by_name'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================================================================== 2 warnings, 1 error in 0.98s ==============================================================================================
make: *** [Makefile:355: run-ixiac-test] Error 2

sirius-pipeline : utils.cpp: In function ‘p4::config::v1::P4Info parse_p4info(const char*)’: google::protobuf::io issue

Hello DASH community and @marian-pritsak
I followed the steps mentioned in https://github.com/Azure/DASH/blob/main/sirius-pipeline/README.md

  1. make clean >>> working fine
  2. make bmv2/sirius_pipeline.bmv2/sirius_pipeline.json >>> working fine
  3. make sai >>> not working well
    make sai has two steps: one updates the SAI folder, and the second compiles the created lib folder; the issue is in compiling the lib folder.

Issue in detail:
A)
Updated the SAI folder with the steps mentioned in generate_dash_api.sh, which is in the SAI folder:
sudo ./SAI/sai_api_gen.py bmv2/sirius_pipeline.bmv2/sirius_pipeline_p4rt.json --ignore-tables=appliance,eni_meter,slb_decap --overwrite=true dash
B)
Once the SAI folder is updated and the corresponding lib folder created, the next step is:
cd lib
sudo make, which results in the following issue:
$ sudo make
g++ -c -I ../SAI/inc/ -I ../SAI/experimental/ -fPIC -g utils.cpp saidash.cpp saidashacl.cpp saidashvnet.cpp
utils.cpp: In function ‘p4::config::v1::P4Info parse_p4info(const char*)’:
utils.cpp:62:27: error: ‘IstreamInputStream’ is not a member of ‘google::protobuf::io’; did you mean ‘CodedInputStream’?
62 | google::protobuf::io::IstreamInputStream istream_(&istream);
| ^~~~~~~~~~~~~~~~~~
| CodedInputStream
utils.cpp:63:42: error: ‘istream_’ was not declared in this scope; did you mean ‘istream’?
63 | google::protobuf::TextFormat::Parse(&istream_, &p4info);
| ^~~~~~~~
| istream
make: *** [Makefile:2: libsai.so] Error 1
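For what it's worth, IstreamInputStream is declared in the protobuf header google/protobuf/io/zero_copy_stream_impl.h, so one likely cause is that utils.cpp (or the protobuf version installed) only pulls in coded_stream.h. Checking that utils.cpp contains

#include <google/protobuf/io/zero_copy_stream_impl.h>

may be a useful first step. This is a guess from the error text, not a confirmed fix.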

ASK
Need help on how to resolve this issue; it is blocking the next steps, like:
make test
make run-test

Add BGP failover details in HA Spec

https://github.com/chrispsommers/DASH/blob/main/documentation/high-avail/design/high-availability-and-scale.md has statements describing failover mechanisms such as:

In case of failure the BGP routes from "SDN Appliance 1" (previously active) will be withdrawn and TOR will prefer "SDN Appliance 2" and redirect traffic there, ensuring continuous traffic and uninterrupted customer experience.

It would be helpful to add details such as:

  • Expand on "BGP routes ... will be withdrawn." There are assumptions in that statement, e.g. is BFD (Bidirectional Forwarding Detection) assumed? What are the BFD timer values (100 msec)?
  • Unplanned failover downtime is stated as < 2 sec, which is a system limit. What are the potential contributors to this downtime, and how shall we budget the downtime amongst them?

Open sourcing SDN (Bluebird) Agent

When does Microsoft plan to open source the SDN (Bluebird) agent? As part of the SONiC DASH, we will need the SDN agent for solution testing.

Definition of non-standard P4 constructs in sirius P4 code, and plan for how to run them?

Examples:

  • The new match kinds list and range_list
  • New keywords state_context and state_graph

To my knowledge, these are not supported by the latest open source P4 compiler and behavioral-model / BMv2 software switch.

Is there a plan to release an open source implementation of a P4 compiler and BMv2 (or another software switch) that can compile these programs?

Or perhaps to release alternate versions of these P4 programs, mechanically translated into P4 that can be compiled and run on an open source P4-programmable software switch?

Or some other plan to enable others to run the reference code in some way?

Testing: Discuss test plan for TCP State machine

Black box (as described in the Test HLD, config + packets in/out), vs White box to observe the State (e.g. of the TCP state machine).

Gerald asked if we could now perform sirius-pipeline TCP state-machine testing, e.g. by sending in packets in a specific sequence and reading the “state” of the TCP state machine at each transition to verify operation. Gerald feels this is necessary to qualify an implementation (e.g. in the lab; it doesn’t have to be done at full speed nor run in production with this observability). Lots of discussion; some comments:
o Chris – yes, you can send specific sequences of packets using the test framework (ixia-c/snappi), but there are no APIs to read TCP state-machine states in the existing design. All current APIs derive from the P4 model itself, so using this approach, we’d have to “model” such APIs as P4 registers or pseudo-tables. Also note that there is no stateful bmv2 implementation at this time; the current one is vanilla bmv2 from the p4lang repo.
o Marian – there are no APIs for this. Also since it would not be a production feature, this is additional work for vendors. Can’t we instead use black-box testing? This means sending in sequences of packets and testing the expected output, without explicitly reading internal states. (Gerald’s proposal is essentially “white-box” testing where we can read the internal state of the system.)
o Consensus was to continue this discussion in the DASH behavioral model WG.

Invalid match type: 'list' error when trying to run the bmv2 switch for the sirius_pipeline project

I followed the same steps as mentioned in the README at https://github.com/Azure/DASH/tree/main/sirius-pipeline

  • Build the environment >>>> worked well
    make docker
  • Build pipeline >>>> worked well
    make clean
    make bmv2/sirius_pipeline.bmv2/sirius_pipeline.json
  • Run software switch >>>> didn't work well
    make run-switch

When I debugged, the main issue is the one below:
$ p4c -b bmv2 bmv2/sirius_pipeline.p4 -o bmv2/sirius_pipeline.bmv2 >>>> worked well
$ simple_switch --log-console --interface 0@veth0 --interface 1@veth2 /bmv2/sirius_pipeline.bmv2/sirius_pipeline.json >>>> didn't work well
Calling target program-options parser
Invalid match type: 'list'

  1. I am attaching the compiled program output sirius_pipeline.json as sirius_pipeline.txt, because .json attachments are not allowed in this issue tracker.
    sirius_pipeline.txt

Could someone suggest a possible solution or workaround?

Improve SDN Features, Packet Transforms and Scale document

This issue collects information and related PRs intended to improve the document SDN Features, Packet Transforms and Scale.
Tasks that may produce related documentation:

  • Routing guidelines. WIP, see #67
  • Packet Flow and Transforms. WIP, see #114
  • VM to VM communication scenario in VNET. Work in draft form not yet available for PR
  • Support features such as Telemetry, Metering, Counters, Billing, Watchdogs, BGP?
  • Servicing
  • Update SDN Features, Packet Transforms and Scale accordingly. WIP.
