1.1.a SDM templates
the following is from mark holm 3xccie. he posted this on the cisco learning network here:
he was kind enough to allow me to reprint it on ccieordie.com, and i am much obliged. if you are not utilizing the cisco learning network on a daily basis, as well as cisco.com for the massive amount of free resources, then you are simply NOT trying…
One of the key components of virtually any layer 2 or layer 3 switch is the TCAM. TCAM is an expensive component and is typically a scarce resource on many switching platforms. This especially holds true on lower-end platforms such as the 3560-X/3750-X, where the administrator must make a decision on how to carve up the available TCAM resources, so that it fits with the role the switch plays in the network.
This document was written with the Catalyst 3750-X platform in mind, so there may be differences to other Catalyst platforms, but generally the concepts are identical.
CAM versus TCAM
Before going into the details of TCAM, we need to establish what a CAM is. First of all, it’s another of those three-letter acronyms we’re all so accustomed to: Content Addressable Memory. As its name suggests, it is a type of memory, but compared to traditional RAM as found in your laptop, CAM operates in a completely different way. Traditional RAM simply returns the content of the memory cells at the address you specify. With CAM, you don’t know the memory location of the content you are interested in, but you know what you want to find. So you supply a binary key, and the CAM searches its entire memory for occurrences of that key. For each occurrence found, the CAM returns the address where that content is stored.
Typically, the key will be the MAC address and/or the VLAN ID of a connected end-host, and the lookup returns information pertaining to that MAC address, such as the associated VLAN and/or the port where it was last seen, which allows the switch to forward the traffic to the correct egress port. All searches use an exact match on the key: if we’re looking for MAC address 000D.29A8.B28E, the search will only return memory addresses whose content matches that MAC address completely, so it won’t return 000D.29A8.B28D or 000D.29A8.B38E. A CAM is often referred to as a binary CAM because it can match only on 0’s and 1’s.
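As a rough software analogy for the exact-match behaviour described above, a binary CAM can be pictured as a table keyed on the full (VLAN, MAC) pair. This is only a sketch: a real CAM searches all entries in parallel in hardware, and the VLAN number, port name, and the names cam_table and cam_lookup are made up for illustration.

```python
# Hypothetical model of a binary CAM: an exact-match table keyed on (VLAN, MAC).
# VLAN 10 and port "Gi1/0/1" are illustrative values, not taken from a real switch.
cam_table = {
    (10, "000D.29A8.B28E"): "Gi1/0/1",
}

def cam_lookup(vlan, mac):
    """Return the egress port for an exact (VLAN, MAC) match, or None."""
    return cam_table.get((vlan, mac))

# Only the exact key hits; a key that differs in a single digit misses.
print(cam_lookup(10, "000D.29A8.B28E"))
print(cam_lookup(10, "000D.29A8.B28D"))
print(cam_lookup(10, "000D.29A8.B38E"))
```

The first lookup returns the port, while the two near-miss addresses return nothing, which is the exact-match property that a TCAM relaxes.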
Switches use an extension of a CAM, known as the TCAM or Ternary Content Addressable Memory. Whereas the binary CAM can match only on the traditional binary zeroes and ones, a TCAM can also match a third state: any value, or ‘don’t care’. This third state is called ‘X’, so a TCAM can work with three different values: ‘0’, ‘1’, and ‘X’, hence the name ‘Ternary’. So, if the value ‘0111XX10’ is associated with a TCAM entry, searching for any of the following values will result in a match:
01110010
01110110
01111010
01111110
The ‘X’ value is not stored as an ‘X’, but is implemented using masks, which in many respects resemble the subnet masks/prefix lengths used in the IPv4 and IPv6 world. Mask values are what allow ACLs to be efficiently compiled and programmed into the TCAM. The process of compiling ACLs and programming them into the TCAM is commonly referred to as ‘merging ACLs into TCAM’ and is handled by the Feature Manager (FM), which is a software component inside the switch.
Each TCAM entry consists of three components: Value, Mask and Result. Both the value and mask fields are 134 bits in length, but that does not necessarily mean that all 134 bits will be actively used. The actual bit utilization depends on the type of ACL being implemented. Regardless of ACL type, the value and mask fields use the exact same bit order. A mask bit that is set means that the corresponding bit position in the value field must match, while a mask bit that is not set means that the corresponding bit position in the value field does not matter. Results are implemented as numeric values that represent the action to take after the TCAM lookup. Besides the basic permit/deny actions found with traditional access lists, the result can also contain a pointer to a next-hop routing table, an index to a QoS policer, etc.
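The value/mask/result idea can be sketched in a few lines of Python. This is a minimal illustration using the 8-bit pattern ‘0111XX10’ rather than the real 134-bit fields; the class TcamEntry, the helper parse_ternary, and the "permit" result string are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TcamEntry:
    value: int   # bits to compare against
    mask: int    # 1 = this bit must match, 0 = don't care ('X')
    result: str  # action to take on a hit (illustrative string)

    def matches(self, key: int) -> bool:
        # Compare only the bit positions where the mask is set.
        return (key & self.mask) == (self.value & self.mask)

def parse_ternary(pattern: str, result: str) -> TcamEntry:
    """Turn a '0'/'1'/'X' pattern into a value/mask pair."""
    value = int(pattern.replace("X", "0"), 2)
    mask = int("".join("0" if c == "X" else "1" for c in pattern), 2)
    return TcamEntry(value, mask, result)

entry = parse_ternary("0111XX10", "permit")
# Try every possible 8-bit key and keep the ones that hit the entry.
matching = [format(key, "08b") for key in range(256) if entry.matches(key)]
print(format(entry.mask, "08b"))
print(matching)
```

The two ‘X’ positions become zero bits in the mask, so four of the 256 possible keys hit the entry and all others miss.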
TCAM entries are organized by mask, with each unique mask having up to eight values associated with it. This allows all of the mask-value pairs to be evaluated simultaneously, so the best or longest match is found with a single lookup operation, which contributes to the fast operation of the TCAM. Once a source/destination mask pair contains eight values, a new mask pair with the same masks is created to accommodate eight additional values.
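The eight-values-per-mask rule amounts to simple packing arithmetic. The sketch below assumes nothing beyond the limit of eight stated above; pack_values and the placeholder value names are hypothetical.

```python
VALUES_PER_MASK = 8  # per the text: each mask holds up to eight values

def pack_values(values):
    """Split the values sharing one mask into blocks of at most eight."""
    return [values[i:i + VALUES_PER_MASK]
            for i in range(0, len(values), VALUES_PER_MASK)]

# 19 values with the same mask pair need three copies of that mask.
blocks = pack_values([f"value-{n}" for n in range(19)])
print([len(block) for block in blocks])
```

In other words, an ACL with many entries that share a mask pair consumes one mask slot per started group of eight values.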
The following is a very simple illustration of how an ACL is merged into the TCAM. It shows the general principles only, and it does not necessarily match how it would be implemented in a real-world TCAM, as there may be some optimizations and adjustments that are made by the TCAM algorithm itself.
Consider the following ACL that you wish to implement on a given switch:
ip access-list extended FILTER
 permit tcp any 10.128.0.0 0.0.0.255 eq 8080
 permit tcp 192.168.0.0 0.0.0.255 10.128.0.0 0.0.0.255 eq 80
 permit tcp 10.128.0.0 0.0.0.255 192.168.0.0 0.0.0.255 eq 443
 deny ip 10.128.0.0 0.0.0.255 192.168.0.0 0.0.0.255
 deny udp 172.16.0.0 0.0.255.255 10.10.0.0 0.0.255.255 eq 5060
When the access-list is entered into the configuration, the Feature Manager’s responsibility is to compile the ACL and program the TCAM with the information from the ACL. The FM starts by identifying unique mask pairs (source/destination) in the ACL. Our test ACL has three unique mask pairs:
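Match 24 bits of destination address – the any keyword is implemented as all ‘X’ values for the source address (line 1)
Match 24 bits of source address and 24 bits of destination address (lines 2, 3 and 4)
Match 16 bits of source address and 16 bits of destination address (line 5)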
The ACL entries are then organized by mask and entered into the TCAM. Again, remember this is a simplified example, so details have deliberately been left out.
Mask 1 contains the following:
The values associated with Mask 1:
Mask 2 contains the following:
The values associated with Mask 2:
Mask 3 contains the following:
The values associated with Mask 3:
As can be seen with Mask 2, this method of compiling ACLs provides an efficient lookup tool that allows the ASIC to compare multiple source/destination IP pairs in one operation. Once there is a hit on the lookup, the ASIC will perform a memory lookup to determine what treatment the packet should get, which in the case of a security ACL is either permit or deny. In the Catalyst 3750-X the TCAM is integrated in the ASIC, which ensures a short internal signal path, which in turn yields shorter handling times.
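The grouping step can be approximated in Python. This is a simplified sketch, not how the Feature Manager is actually implemented: it looks only at the source and destination addresses of the FILTER ACL, ignores protocol and port bits, and the helper names take_address and group_by_mask_pair are hypothetical.

```python
import ipaddress

ACL = """\
permit tcp any 10.128.0.0 0.0.0.255 eq 8080
permit tcp 192.168.0.0 0.0.0.255 10.128.0.0 0.0.0.255 eq 80
permit tcp 10.128.0.0 0.0.0.255 192.168.0.0 0.0.0.255 eq 443
deny ip 10.128.0.0 0.0.0.255 192.168.0.0 0.0.0.255
deny udp 172.16.0.0 0.0.255.255 10.10.0.0 0.0.255.255 eq 5060
""".splitlines()

def take_address(tokens):
    """Consume 'any' or 'address wildcard'; return (address, match mask)."""
    if tokens[0] == "any":
        tokens.pop(0)
        return "0.0.0.0", "0.0.0.0"  # all bits are don't care
    address, wildcard = tokens.pop(0), tokens.pop(0)
    # An ACL wildcard marks don't-care bits; invert it so 1 = must match.
    mask = int(ipaddress.IPv4Address(wildcard)) ^ 0xFFFFFFFF
    return address, str(ipaddress.IPv4Address(mask))

def group_by_mask_pair(lines):
    """Group ACL entries by their (source mask, destination mask) pair."""
    groups = {}
    for line in lines:
        tokens = line.split()
        action = tokens.pop(0)
        tokens.pop(0)  # protocol, ignored in this simplified model
        src, src_mask = take_address(tokens)
        dst, dst_mask = take_address(tokens)
        groups.setdefault((src_mask, dst_mask), []).append((src, dst, action))
    return groups

groups = group_by_mask_pair(ACL)
for number, ((src_mask, dst_mask), values) in enumerate(groups.items(), 1):
    print(f"Mask {number}: source {src_mask}, destination {dst_mask}")
    for value in values:
        print("   ", value)
```

Under these assumptions the five ACL lines collapse into three mask pairs, and the second pair carries three values that can be compared in a single lookup.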
When a frame is received on a port, the first 200 bytes of the packet are copied to the forwarding controller, which is responsible for performing the actual lookups in the TCAM. These 200 bytes contain the information required to make forwarding decisions (VLANs, egress port(s) etc.) and to determine the treatment the packet should receive in terms of QoS and ACLs. Usually, three TCAM lookups will be made during processing of a packet, but it depends on the architecture of the specific platform:
Between the TCAM lookups, there are several additional steps that are performed as part of the forwarding process inside the ASIC, but these are irrelevant for the sake of describing the TCAM. As you may have guessed by now, the TCAM lookups (which are very, very fast) are one of the key reasons why switches are able to forward packets at line rate on all ports (this of course assumes the switch architecture itself is not purposely built with oversubscription).
Now that the TCAM has been described in some detail, it’s time to look at how to control the allocation of TCAM resources to certain functions. Normally, as an end-user, you won’t be exposed heavily to the details and operations of the TCAM, as the TCAM isn’t really user configurable per se. But as briefly touched on in the introduction, one of the most important, and yet overlooked, configuration options on lower-end Catalyst switching platforms such as the Catalyst 2K and Catalyst 3K is the SDM (Switching Database Manager) template. The SDM template defines how the available TCAM resources are carved up to reserve space for different applications. Configuring the SDM template takes just a single command (sdm prefer in global configuration mode) followed by a switch reload, so it is in no way a complex procedure to carry out. In many cases the default template will be sufficient, so for many people it’s all about making a judgement call on whether to keep it. The available SDM templates vary between the different switch platforms, so you need to consult the documentation for your platform to determine which ones are available, and which one is the right one to use.
The TCAM is split into five main areas or partitions:
The SDM template adjusts the size of each of these areas or partitions, and each of the partitions may contain several tables. The exact details of the tables are not publicly described in detail, but can be examined using various show platform tcam subcommands. The following SDM template resource values are taken from the Catalyst 3750-X configuration guides available at cisco.com:
One thing I’ve personally seen many people wondering about is the apparent lack of IPv6 support on Catalyst 3K switches. With the default SDM template applied, it is not possible to configure IPv6 on the switch. To enable IPv6 support, one of the SDM templates with IPv6 support must be activated on the switch. The reason for this requirement is that IPv6 addresses take up more TCAM resources because of their length (128-bit addresses vs. 32-bit). This requires the TCAM resources to be carved out in a different way to make room for the longer addresses. The dual-stack SDM templates available on the Catalyst 3750-X platform are as follows:
As can be deduced from the two tables above, enabling one feature will usually result in decreased capacity for other features, or even disable them entirely. That is the main reason why it is so important to carefully consider which template to apply when the switch is deployed. Of course the decision isn’t final. It can be changed at any time, but as it requires a reload of the switch, it causes service disruptions, and naturally this is something we all want to avoid as much as possible.
The SDM templates are predefined and can’t be modified or customized. So from time to time there will be cases where the templates don’t match exactly with the intended switch role – in those cases it may prove necessary to choose a template that has some tradeoffs.
Choosing the wrong template can have a huge negative impact on switch performance. Consider a scenario where a 3750-X switch is re-deployed as a distribution switch for a large enterprise network with 5,000 clients. The switch previously acted as a regional router, so it is configured to use the “Routing” SDM template. After deploying the switch, its CPU utilization periodically comes close to 100% and users are complaining about poor performance. Checking the current TCAM utilization during a period of high CPU utilization using the show platform tcam utilization command shows that approximately 4,300 MAC addresses are active on the switch. This is well beyond the 3,000 MAC addresses supported with the “Routing” SDM template. When the resources for one of the features are exhausted, the switch starts to process that particular feature in software using the general-purpose CPU. In the case of layer 2 or layer 3 switching, it means that the switch actually performs the switching function in software instead of using the ASICs (known as process switching, where the packet is punted to the CPU). This naturally results in much lower performance compared to leaving the job to purpose-built ASICs, meaning that you will no longer have line rate on your ports.
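The check in this scenario boils down to comparing observed usage against the limit of the active template. The sketch below uses only the two figures quoted above (3,000 supported and 4,300 observed MAC addresses); the dictionary TEMPLATE_MAC_LIMITS and the function mac_table_overflow are hypothetical, and limits for other templates are deliberately left out rather than guessed.

```python
# MAC address limit for the "Routing" template as quoted in the text.
# Other templates are omitted on purpose; consult the platform documentation.
TEMPLATE_MAC_LIMITS = {"routing": 3000}

def mac_table_overflow(template, active_macs):
    """Return how many MAC addresses exceed the template's limit (0 if none)."""
    limit = TEMPLATE_MAC_LIMITS[template]
    return max(0, active_macs - limit)

overflow = mac_table_overflow("routing", 4300)
if overflow:
    print(f"{overflow} MAC addresses do not fit in the TCAM for this template")
```

In the scenario above, roughly 1,300 addresses fall outside the space the template reserves for MAC entries, which is consistent with the high CPU utilization described.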
On a side note, high-end switching platforms such as the Catalyst 4500, Catalyst 6500, and Catalyst 6800 do not use the concept of SDM templates, but they still have (plenty of) TCAM resources available. The TCAM resources on these switches are predefined and cannot be modified. Typically, these switches feature more TCAM resources than lower-end switches, and often they will have TCAMs that are dedicated to specific purposes such as QoS ACLs and security ACLs. Details on available TCAM resources on these platforms can be found in the data sheets for linecards and supervisor modules for the particular platform.
If you would like additional information on this topic and on switch architectures, the Cisco Live archives provide a wealth of information. For a detailed description of the Catalyst 3750-X architecture (which is used as the basis for this document), go to https://www.ciscolive.com/online/connect/search.ww and search for BRKCRS-3141 (registration/login is required). It is absolutely worth reading the PDF and watching the video.
The topic is also covered in the official CCNP SWITCH book from Cisco Press.
1.1.a SDM templates
You can use SDM templates to configure system resources in the switch to optimize support for specific features, depending on how the switch is used in the network. You can select a template to provide maximum system usage for some functions; for example, use the default template to balance resources, and use the access template to obtain maximum ACL usage. To allocate hardware resources for different usages, the switch SDM templates prioritize system resources to optimize support for certain features.