Dockerfile-default-rocm split into two separate files mainly aimed at different users. #272
Open
…eanup needed in the Makefile.
ARG BASE_IMAGE
FROM ${BASE_IMAGE}

# MAY NOT BE IMPORTANT ANYMORE
Contributor
Author
I guess we can remove it; if something breaks, we can add it back in.
# THIS FIX IS FOR SAWMILL, UNCLEAR IF NECESSARY FOR GENERAL USERS
#TODO: is this necessary?
@@ -0,0 +1,55 @@
ARG BASE_IMAGE
FROM ${BASE_IMAGE}
#why no highlighting?

# LIBFABRIC ISSUE
# USE CONDA FOR WORKAROUND
#TODO: MAY NOT BE A PROBLEM ANYMORE?
#TODO: finish iterating here, preferably turn it into a shell script.
RUN if [ -n "$DEEPSPEED_PIP" ]; then DEBIAN_FRONTEND=noninteractive apt-get install -y pdsh libaio-dev&& git clone https://github.com/ROCmSoftwarePlatform/triton.git && cd triton && git checkout triton-mlir && cd python && pip3 install ninja cmake && python setup.py install;fi
RUN if [ -n "$DEEPSPEED_PIP" ]; then DEBIAN_FRONTEND=noninteractive apt-get install -y pdsh libaio-dev&& python -m pip install pydantic==1.10.11 && git clone https://github.com/ROCmSoftwarePlatform/DeepSpeed.git && cd DeepSpeed && python3 setup.py build && python3 setup.py install && python -m deepspeed.env_report; fi
RUN if [ -n "$DEEPSPEED_PIP" ]; then python -m deepspeed.env_report ; fi
This DeepSpeed section definitely needs cleanup.
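As a starting point for the "turn it into a shell script" TODO, the three RUN lines could be folded into something like the sketch below. This is only an illustration, not code from this PR: the script name `install_deepspeed.sh`, the `run` / `run_in` helpers, and the `DRY_RUN` switch are all hypothetical. The commands themselves are the ones already in the RUN lines, with the duplicated `apt-get install` and the duplicated `deepspeed.env_report` call collapsed to one each.

```shell
#!/bin/sh
# Hypothetical install_deepspeed.sh: a sketch only, not part of this PR.
set -eu

# run CMD...: execute a command, or just print it when DRY_RUN=1
# (illustrative helper so the steps can be inspected without side effects).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# run_in DIR CMD...: like run, but executed from inside DIR.
run_in() {
    dir="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ (cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

install_deepspeed() {
    # Same guard as the RUN lines: do nothing unless DEEPSPEED_PIP is non-empty.
    if [ -z "${DEEPSPEED_PIP:-}" ]; then
        echo "DEEPSPEED_PIP not set; skipping DeepSpeed install"
        return 0
    fi

    # The RUN lines install these packages twice; once is enough.
    run env DEBIAN_FRONTEND=noninteractive apt-get install -y pdsh libaio-dev

    # Triton from the ROCm fork, triton-mlir branch.
    run git clone https://github.com/ROCmSoftwarePlatform/triton.git
    run git -C triton checkout triton-mlir
    run pip3 install ninja cmake
    run_in triton/python python setup.py install

    # DeepSpeed from the ROCm fork, with pydantic pinned as in the RUN line.
    run python -m pip install pydantic==1.10.11
    run git clone https://github.com/ROCmSoftwarePlatform/DeepSpeed.git
    run_in DeepSpeed python3 setup.py build
    run_in DeepSpeed python3 setup.py install

    # The RUN lines call the environment report twice; once at the end suffices.
    run python -m deepspeed.env_report
}

install_deepspeed
```

The Dockerfile would then only need to copy the script in and run it once, assuming DEEPSPEED_PIP is visible in the build environment at that point. Running it with `DRY_RUN=1 DEEPSPEED_PIP=1` prints the planned commands without executing them, which makes the ordering easy to review.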
INFINITYHUB_PYTORCH_PREFIX := rocm/pytorch
INFINITYHUB_TENSORFLOW_PREFIX := rocm/tensorflow
INFINITYHUB_PYTORCH_VERSION := 2.1.2
INFINITYHUB_TENSORFLOW_VERSION :=
MikhailKardash
left a comment
Still needs CI build integration as well, so I expect there to be .circleci code changes.
jgongd
reviewed
Jul 16, 2024
ROCM_57_PREFIX := $(REGISTRY_REPO):rocm-5.7-
ROCM_60_PREFIX := $(REGISTRY_REPO):rocm-6.0-
ROCM_61_PREFIX := $(REGISTRY_REPO):rocm-6.1-
ROCM_60_TF_PREFIX := tensorflow-infinity-hub:tensorflow-infinity-hub
Contributor
Why are the other images stored in REGISTRY_REPO := environments, or in a repo with a -dev suffix, but this one is not?
jgongd
reviewed
Jul 16, 2024
Contributor
How do we test these images?
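One possible minimal check, offered only as a sketch: import the framework inside the built image and print its version. Nothing below is part of this PR. The `smoke_test` function and the `DOCKER` override are hypothetical, and the check assumes that `python` is on the image's PATH (the Dockerfile's own RUN lines call it) and that the module exposes `__version__`, which both `torch` and `tensorflow` do.

```shell
#!/bin/sh
# Hypothetical smoke test for a built image; not part of this PR.
set -eu

# smoke_test IMAGE MODULE: import MODULE inside IMAGE and print its version.
# DOCKER can be overridden (for example DOCKER=echo) to inspect the command
# without running a container.
smoke_test() {
    image="$1"
    module="$2"
    "${DOCKER:-docker}" run --rm "$image" \
        python -c "import ${module}; print(${module}.__version__)"
}
```

For example, `smoke_test <image-tag> torch` for the PyTorch image and `smoke_test <image-tag> tensorflow` for the TensorFlow one. This only proves that the framework imports; anything that exercises the GPU would additionally need the container to be given device access.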
Description
Checklist
bumpenvs procedure in the determined repo. See README.