
SMART Modular Technologies Introduces New Family of CXL Add-in Cards for Memory Expansion

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
46,476 (7.66/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
SMART Modular Technologies, Inc. ("SMART"), a division of SGH (Nasdaq: SGH) and a global leader in memory solutions, solid-state drives, and advanced memory, announces its new family of Add-In Cards (AICs), which implement the Compute Express Link (CXL) standard and support industry-standard DDR5 DIMMs. These are the first high-density DIMM AICs in their class to adopt the CXL protocol. The SMART 4-DIMM and 8-DIMM products enable server and data center architects to add up to 4 TB of memory in a familiar, easy-to-deploy form factor.

"The market for CXL memory components for data center applications is expected to grow rapidly. Initial production shipments are expected in late 2024 and will surpass the $2 billion mark by 2026. Ultimately, CXL attach rates in the server market will reach 30% including both expansion and pooling use cases," stated Mike Howard, vice president of DRAM and memory markets at TechInsights, an intelligence source to semiconductor innovation and related markets.



"The CXL protocol is an important step toward achieving industry standard memory disaggregation and sharing which will significantly improve the way memory is deployed in the coming years," said Andy Mills, senior director of advanced product development at SMART Modular, reinforcing Howard's market analysis and SMART's rationale for developing this family of CXL-related products.

SMART's 4-DIMM and 8-DIMM AICs are built using advanced CXL controllers that eliminate memory bandwidth bottlenecks and capacity constraints for compute-intensive workloads encountered in Artificial Intelligence (AI), high performance computing (HPC), and Machine Learning (ML). These emerging applications require larger amounts of high-speed memory than current servers can accommodate. Attempts to add more memory via the traditional DIMM-based parallel bus interface are becoming problematic due to pin limitations on CPUs, so the industry is turning to CXL-based solutions, which are more pin-efficient.

Technical Specifications
About SMART's 4-DIMM and 8-DIMM DDR5 AICs
  • Available as a CXL Type 3 device in a PCIe Gen 5 Full-Height, Half-Length (FHHL) form factor.
  • The 4-DIMM AIC (CXA-4F1W) accommodates four DDR5 RDIMMs with a maximum of 2 TB of memory capacity when using 512 GB RDIMMs, and the 8-DIMM AIC (CXA-8F2W) accommodates eight DDR5 RDIMMs with a maximum of 4 TB of memory capacity.
  • The 4-DIMM AIC uses a single CXL controller implementing one x16 CXL port, while the 8-DIMM AIC uses two CXL controllers to implement two x8 ports, both resulting in a total bandwidth of 64 GB/s (see the arithmetic sketch after this list).
  • The CXL controllers support Reliability, Availability, and Serviceability (RAS) features and advanced analytics.
  • Both offer enhanced security features with in-band or side-band (SMBus) monitoring capability.
  • To accelerate memory processing, these add-in cards are compatible with SMART's Zefr ZDIMMs.
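As a rough check on that 64 GB/s figure, here is a minimal sketch of the PCIe Gen 5 arithmetic. The 32 GT/s per-lane rate and 128b/130b encoding are standard PCIe 5.0 parameters; the port layouts follow the bullet above.

```python
# Rough PCIe 5.0 bandwidth arithmetic for the two AIC layouts.
# PCIe 5.0 runs at 32 GT/s per lane with 128b/130b encoding
# (128 payload bits per 130 transmitted bits). Figures are per
# direction and ignore packet-level (TLP/DLLP) overhead.

RATE_PER_LANE = 32e9    # bits per second per lane, raw
ENCODING = 128 / 130    # 128b/130b line-code efficiency

def link_gbs(lanes: int) -> float:
    """Payload bandwidth of one link in GB/s, per direction."""
    return lanes * RATE_PER_LANE * ENCODING / 8 / 1e9

four_dimm = link_gbs(16)                # one x16 port (CXA-4F1W)
eight_dimm = link_gbs(8) + link_gbs(8)  # two x8 ports (CXA-8F2W)

print(f"4-DIMM, one x16 port: {four_dimm:.3f} GB/s")   # ~63.015 GB/s
print(f"8-DIMM, two x8 ports: {eight_dimm:.3f} GB/s")  # same total
```

Both layouts land at the same ~63 GB/s of usable bandwidth, which is where the marketing-rounded 64 GB/s comes from.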
CXL also enables lower-cost scaling of memory capacity. Using SMART's AICs, servers can reach up to 1 TB of memory per CPU with cost-effective 64 GB RDIMMs. The cards also offer supply chain optionality: replacing high-density RDIMMs with a greater number of lower-density modules can lower system memory costs, depending on market conditions.
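To make the capacity claim concrete, a small sketch under an assumed configuration (the 8-slots-per-CPU count below is a hypothetical example, not a figure from SMART):

```python
# How "1 TB per CPU with 64 GB RDIMMs" can be reached. The
# 8-slots-per-CPU count is an assumed example configuration.

CPU_SLOTS = 8      # assumed DIMM slots on the CPU's own channels
AIC_SLOTS = 8      # SMART 8-DIMM AIC (CXA-8F2W)
MODULE_GB = 64     # cost-effective 64 GB RDIMMs

total_gb = (CPU_SLOTS + AIC_SLOTS) * MODULE_GB
print(f"{CPU_SLOTS + AIC_SLOTS} x {MODULE_GB} GB = {total_gb} GB")  # 1024 GB, ~1 TB

# Supply chain optionality: the same capacity could come from
# eight 128 GB RDIMMs instead -- whichever density is cheaper
# per GB at purchase time wins.
print(f"{CPU_SLOTS} x 128 GB = {CPU_SLOTS * 128} GB")
```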

Visit SMART's 4-DIMM product page and 8-DIMM AIC product page for further information, and the CMM/CXL family page for information on SMART's other products using the CXL standard. SMART will provide samples to OEMs upon request. These new CXL-based AIC products join SMART's ZDIMM line of DRAM as ideal solutions for demanding memory design-in applications.

 
Joined
Mar 7, 2011
Messages
3,979 (0.83/day)
Installing DIMMs into that 4-slot card is going to be a pain compared to the 8-slot version, where the DIMMs are perpendicular to the card (though that comes at the cost of the number of slots used).
 
Joined
Jun 1, 2021
Messages
207 (0.19/day)
I think this would be interesting if it came to client/consumer platforms. Imagine if you could use a PCIe x8 card with one or two DIMMs to expand your memory pool at 32 GB/s or so.

Could be usable for a lot of applications, I think. Like a huge buffer space for games.
 
Joined
Feb 22, 2022
Messages
531 (0.65/day)
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Crosshair VIII Dark Hero
Cooling Custom Watercooling
Memory G.Skill Trident Z Royal 2x16GB
Video Card(s) MSi RTX 3080ti Suprim X
Storage 2TB Corsair MP600 PRO Hydro X
Display(s) Samsung G7 27" x2
Audio Device(s) Sound Blaster ZxR
Power Supply Be Quiet! Dark Power Pro 12 1500W
Mouse Logitech G903
Keyboard Steelseries Apex Pro
I think this would be interesting if it came to client/consumer platforms. Imagine if you could use a PCIe x8 card with one or two DIMMs to expand your memory pool at 32 GB/s or so.

Could be usable for a lot of applications, I think. Like a huge buffer space for games.
You're still limited by the PCIe bus speed, which in your example of PCIe x8 would be twice the transfer speed of an NVMe drive. So why bother? Just stick with NVMe drives. A lot more bang for the buck and a less complicated storage hierarchy.

I mean, there are use cases for this, certain types of servers being one of them. But for something like a huge buffer space for games? Those already exist. They're called NVMe SSDs.
 
Joined
Jan 3, 2021
Messages
2,764 (2.25/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
63.015 GB/s, to be exact. The round 64 GB/s comes from the raw 32 GT/s per lane before 128b/130b encoding.
That's the best case for large transfers. Small transfers have a large percentage of overhead: a PCIe packet header is ~16 bytes and some commands and addresses have to be transmitted too. In contrast, DDR has a command/address bus that's separate from data bus. Small transfers can be really small - 64 bytes in DDR, which equals one cache line in current CPUs.

On the other hand, in PCIe's favour, that 64 GB/s is in each direction.
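A quick sketch of the small-transfer overhead described above, assuming roughly 24 bytes of per-packet overhead (header plus framing; the exact figure varies with link configuration, so treat these numbers as illustrative):

```python
# Effective PCIe efficiency for small vs. large payloads.
# Assumes ~24 bytes of per-TLP overhead (header + framing);
# the exact overhead depends on the link configuration.

OVERHEAD = 24  # bytes per packet (assumed)

def efficiency(payload_bytes: int) -> float:
    return payload_bytes / (payload_bytes + OVERHEAD)

for payload in (64, 256, 4096):
    print(f"{payload:5d} B payload -> {efficiency(payload):.1%} efficient")

# 64 B (one cache line) -> ~73%: a big slice of the link carries headers.
# 4096 B -> ~99%: the overhead nearly vanishes for large transfers.
```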
 
Joined
Sep 1, 2020
Messages
2,057 (1.52/day)
Location
Bulgaria
That's the best case for large transfers. Small transfers have a large percentage of overhead: a PCIe packet header is ~16 bytes and some commands and addresses have to be transmitted too. In contrast, DDR has a command/address bus that's separate from data bus. Small transfers can be really small - 64 bytes in DDR, which equals one cache line in current CPUs.

On the other hand, in PCIe's favour, that 64 GB/s is in each direction.
[Screenshot: Wikipedia's PCIe bandwidth table]
@Wikipedia.
 
Joined
Aug 22, 2007
Messages
3,464 (0.57/day)
Location
CA, US
System Name :)
Processor Intel 13700k
Motherboard Gigabyte z790 UD AC
Cooling Noctua NH-D15
Memory 64GB GSKILL DDR5
Video Card(s) Gigabyte RTX 4090 Gaming OC
Storage 960GB Optane 905P U.2 SSD + 4TB PCIe4 U.2 SSD
Display(s) Alienware AW3423DW 175Hz QD-OLED + Nixeus 27" IPS 1440p 144Hz
Case Fractal Design Torrent
Audio Device(s) MOTU M4 - JBL 305P MKII w/2x JL Audio 10 Sealed --- X-Fi Titanium HD - Presonus Eris E5 - JBL 4412
Power Supply Silverstone 1000W
Mouse Roccat Kain 122 AIMO
Keyboard KBD67 Lite / Mammoth75
VR HMD Reverb G2 V2
Software Win 11 Pro
Also, this would have much higher latency than say an Optane DIMM, right?
Joined
Jan 3, 2021
Messages
2,764 (2.25/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
@Wikipedia.
Here's the note below that table:
[Screenshot: the footnote below Wikipedia's PCIe bandwidth table]


I'm not being very exact here; I don't make a distinction between 64 and 63.015. But I am making a distinction between 64 and 64+64.

Also, this would have much higher latency than say an Optane DIMM, right?
No, it will still be far faster. I've read somewhere (Tom's) that the additional latency is supposed to be similar to a processor accessing the RAM of the next closest NUMA node (i.e., another processor's RAM in a 2- or 4-processor system). That's more than 100 ns. Of course, DRAM also adds ~50 ns of its own.

The remote memory, or in this case, a hybrid RAM/flash memory device, is accessible over the PCIe bus, which comes at the cost of ~170-250ns of latency, or roughly the cost of a NUMA hop.
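Summing those figures gives a rough end-to-end budget. The sketch below assumes the quoted ~170-250 ns is the added PCIe/CXL cost on top of the DRAM access itself; all values are estimates, not measurements:

```python
# Back-of-envelope latency budget for CXL-attached DRAM, restating
# the rough figures quoted above (estimates, not measurements).

DRAM_NS = 50               # DRAM access on its own (~50 ns)
CXL_HOP_NS = (170, 250)    # PCIe/CXL traversal, "roughly a NUMA hop"

low = DRAM_NS + CXL_HOP_NS[0]
high = DRAM_NS + CXL_HOP_NS[1]
print(f"CXL-attached DRAM: ~{low}-{high} ns per access")  # ~220-300 ns

# For scale (rough public figures): local DRAM ~50-100 ns, Optane
# DIMMs ~300 ns or more, NVMe SSDs ~10,000 ns and up. CXL memory
# lands between local DRAM and persistent memory, consistent with
# "still far faster" than an Optane DIMM.
```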

By the way, what's the latency of dGPU memory when read/written by the CPU? That's a situation somewhat similar to CXL memory because data moves in packets over PCIe.
 
Joined
Jun 1, 2021
Messages
207 (0.19/day)
You're still limited by the PCIe bus speed, which in your example of PCIe x8 would be twice the transfer speed of an NVMe drive. So why bother? Just stick with NVMe drives. A lot more bang for the buck and a less complicated storage hierarchy.

I mean, there are use cases for this, certain types of servers being one of them. But for something like a huge buffer space for games? Those already exist. They're called NVMe SSDs.

Yes, you would be limited to PCIe x8 speeds, but since those are Gen 5, that's still 32 GB/s. NVMe drives are struggling to even reach 12 GB/s, and only with the super expensive Gen 5 models.

NVMe drives have a lot of issues. Yes, the sequential speed is fast, but what about random 4K? Progress there hasn't moved nearly as much. That was kind of the whole point of Optane, actually. Such a solution would give orders of magnitude more performance in quite a few areas and could be used in more ways, thanks to better latency, IOPS, etc.
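To put rough numbers on the random-4K point, here is a minimal sketch of queue-depth-1 throughput as a function of access latency (the latencies below are illustrative ballpark figures, not benchmarks):

```python
# At queue depth 1, random-read throughput is bounded by latency:
# one outstanding request at a time means throughput = block / latency.
# Latencies are rough, illustrative figures.

BLOCK = 4096  # bytes per random read

latency_us = {
    "CXL/DRAM-class memory": 0.25,   # ~250 ns
    "Optane SSD":            10.0,   # ~10 us
    "Flash NVMe SSD":        80.0,   # ~80 us
}

for name, lat in latency_us.items():
    mb_per_s = BLOCK / (lat * 1e-6) / 1e6
    print(f"{name:22s}: ~{mb_per_s:8.1f} MB/s at QD1")

# Sequential numbers hide this: a Gen 5 SSD can stream 12 GB/s yet
# still manage only tens of MB/s on QD1 random 4K reads.
```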
 
Joined
Aug 22, 2007
Messages
3,464 (0.57/day)
Location
CA, US
System Name :)
Processor Intel 13700k
Motherboard Gigabyte z790 UD AC
Cooling Noctua NH-D15
Memory 64GB GSKILL DDR5
Video Card(s) Gigabyte RTX 4090 Gaming OC
Storage 960GB Optane 905P U.2 SSD + 4TB PCIe4 U.2 SSD
Display(s) Alienware AW3423DW 175Hz QD-OLED + Nixeus 27" IPS 1440p 144Hz
Case Fractal Design Torrent
Audio Device(s) MOTU M4 - JBL 305P MKII w/2x JL Audio 10 Sealed --- X-Fi Titanium HD - Presonus Eris E5 - JBL 4412
Power Supply Silverstone 1000W
Mouse Roccat Kain 122 AIMO
Keyboard KBD67 Lite / Mammoth75
VR HMD Reverb G2 V2
Software Win 11 Pro
No, it will still be far faster. I've read somewhere (Tom's) that the additional latency is supposed to be similar to when a processor is accessing RAM of the next closest NUMA node (i.e. another processor's RAM in a 2 or 4-processor system). That's more than 100 ns. Of course, DRAM also has ~50 ns on its own.
I'm talking about Optane DIMMs, not SSDs.
Here's an interesting paper that talks about persistent memory/Optane and CXL: https://www3.cs.stonybrook.edu/~anshul/dimes23-pmem.pdf
 