@chhwang chhwang commented Jul 31, 2025

  • Allow CudaIpc connections between GPUs in a single process
  • Add an example of a connection within a single process
  • Make minor interface updates

@chhwang chhwang requested review from Binyang2014 and Copilot July 31, 2025 07:14

Copilot AI left a comment

Pull Request Overview

Extends the CudaIpc transport to support connections between GPUs within a single process; previously it worked only across different processes. This enables GPU-to-GPU communication within the same application.

Key Changes

  • Added process ID hash tracking to endpoints for intra-process GPU connections
  • Enabled peer access for same-process GPU connections
  • Updated constants and APIs for better maintainability
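
As a rough illustration of the first two items, the sketch below shows how a per-process hash could let an endpoint decide whether its peer lives in the same process. All names here (`EndpointInfo`, `computePidHash`, `makeLocalEndpoint`, `isSameProcess`) are hypothetical, and the hash recipe (host name combined with PID) is an assumption for illustration only; the actual `pidHash` in src/endpoint.cc may be computed differently.

```cpp
#include <unistd.h>

#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for the endpoint data exchanged between peers.
struct EndpointInfo {
  uint64_t hostHash;  // identifies the machine
  uint64_t pidHash;   // identifies the process on that machine
  int deviceId;       // GPU index owned by this endpoint
};

// Assumed recipe: hash the host name together with the PID so that
// equal PIDs on different hosts do not compare as the same process.
inline uint64_t computePidHash(const std::string& hostName, long pid) {
  return std::hash<std::string>{}(hostName + ":" + std::to_string(pid));
}

// Build the endpoint description for a GPU owned by the calling process.
inline EndpointInfo makeLocalEndpoint(const std::string& hostName, int deviceId) {
  assert(deviceId >= 0);
  return {std::hash<std::string>{}(hostName),
          computePidHash(hostName, static_cast<long>(getpid())), deviceId};
}

// Two endpoints belong to the same process when both hashes match.
inline bool isSameProcess(const EndpointInfo& a, const EndpointInfo& b) {
  return a.hostHash == b.hostHash && a.pidHash == b.pidHash;
}
```

When a check like `isSameProcess` succeeds for two endpoints on different devices, connection setup can take the same-process path, which per the file summary in this review is where peer access gets enabled (`cudaDeviceEnablePeerAccess`).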

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/registered_memory.cc | Removed namespace prefix from SysError exception |
| src/include/endpoint.hpp | Added pidHash field to endpoint implementation |
| src/endpoint.cc | Added pidHash serialization and getter methods |
| src/context.cc | Added peer access enablement for same-process GPU connections |
| src/connection.cc | Moved connection constructor to implementation file |
| python/mscclpp/core_py.cpp | Removed deprecated Nvls transport enum value |
| include/mscclpp/port_channel_device.hpp | Replaced macros with constexpr constants |
| include/mscclpp/gpu.hpp | Added HIP compatibility for cudaDeviceEnablePeerAccess |
| include/mscclpp/core.hpp | Removed Nvls transport and updated constants |
| examples/tutorials/01-basic-concepts/ | Added complete GPU ping-pong example with build files |
| apps/nccl/src/nccl.cu | Updated to use new constant naming |

Comments suppressed due to low confidence (1)

examples/tutorials/01-basic-concepts/gpu_ping_pong.cu:10

  • [nitpick] Function names 'gpuKernel0' and 'gpuKernel1' are not descriptive. Consider more meaningful names like 'pingKernel' and 'pongKernel' or 'initiatorKernel' and 'responderKernel'.
__global__ void gpuKernel0(mscclpp::BaseMemoryChannelDeviceHandle *devHandle, int iter) {

@Binyang2014 Binyang2014 enabled auto-merge (squash) August 1, 2025 20:39
@Binyang2014 Binyang2014 disabled auto-merge August 1, 2025 20:51
@chhwang chhwang merged commit c580e4c into main Aug 2, 2025
14 checks passed
@chhwang chhwang deleted the chhwang/1-proc-conn branch August 2, 2025 04:59