
Table of Contents

Title Page

Abstract

Contents

I. Introduction 9

1.1. Contribution 11

1.2. Organization 11

II. Background 13

2.1. Deep Neural Network Training 13

2.2. GPU Sharing Use Cases 13

2.3. Spatial GPU Sharing 14

III. Challenges for Memory Sharing 15

3.1. Memory Bloating 15

3.2. Workload Variability 16

3.3. Asynchrony with GPU Processing 16

IV. Design Overview 18

V. Scheduling Algorithm 21

5.1. Problem Definition 21

5.2. Time Shift Model 21

5.3. Memory Sharing Algorithm 22

VI. Memory Management with Concurrency 25

6.1. Tracking Memory Usage in GPU 25

6.2. Tensor Classification 26

6.3. Managing Memory Regions 26

VII. Evaluation 28

7.1. Training Same Models 28

7.2. Training Non-identical Models 30

7.3. Dynamic Memory Budget Change 31

7.4. Design Validation 32

VIII. Discussion 34

IX. Related Work 35

X. Concluding Remarks 36

References 37

List of Figures

Figure 1. Cumulative distribution of NASNet tensor lifespan. 15

Figure 2. Memory usage patterns for different DNN models over time. 15

Figure 3. GPU memory usage from CPU view (ResNet-50). 17

Figure 4. System architecture in Zico. 18

Figure 5. Throughput in training the same models. 28

Figure 6. Aggregated memory usage for training the same models. 29

Figure 7. Memory usage over time for training the same models. 29

Figure 8. Throughput in training the distinct models concurrently. 30

Figure 9. Memory usage over time for training NASNet and ResNet-110 concurrently. 30

Figure 10. Memory usage patterns on dynamic memory budget changes. 31

Abstract

GPUs are the workhorse of modern server infrastructure, fueling advances in compute-intensive workloads such as deep neural network (DNN) training. Several recent works propose solutions for sharing GPU resources across multiple concurrent DNN training jobs, but none of them addresses the rapidly increasing memory footprint introduced by such job co-location, which greatly limits the effectiveness of GPU sharing. This paper presents Zico, the first DNN system that aims to reduce system-wide memory consumption for concurrent training. Zico tracks the memory usage pattern of each training job by monitoring its progress on GPU computations and makes memory reclaimed from one job globally sharable. Based on this memory management scheme, Zico automatically decides a strategy for sharing memory among concurrent jobs that minimizes training delay while staying within a given memory budget, such as the GPU memory capacity. Our evaluation shows that Zico outperforms existing GPU sharing approaches and delivers benefits across a variety of job co-location scenarios.
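The abstract (together with the "Time Shift Model" section in the table of contents) suggests the core idea: DNN training memory usage is cyclic per iteration, so offsetting one job's iterations in time against another's can keep the combined peak under a budget. The sketch below is an illustrative toy, not Zico's actual algorithm; the profile arrays, the exhaustive shift search, and all function names are assumptions made for exposition.

```python
# Illustrative sketch (not Zico's implementation): given periodic per-slot
# memory profiles of two training jobs, choose the time shift for the second
# job that minimizes their peak combined memory footprint.

def peak_combined(profile_a, profile_b, shift, period):
    # Profiles repeat every `period` time slots (one training iteration);
    # the peak is taken over one full period of the combined usage.
    return max(
        profile_a[t % period] + profile_b[(t + shift) % period]
        for t in range(period)
    )

def best_shift(profile_a, profile_b, budget):
    # Exhaustively try every cyclic offset of job B against job A.
    period = len(profile_a)
    candidates = [
        (peak_combined(profile_a, profile_b, s, period), s)
        for s in range(period)
    ]
    peak, shift = min(candidates)
    # Only co-locate if the best alignment fits under the memory budget.
    return (shift, peak) if peak <= budget else None

# Two jobs whose usage climbs during the forward pass and falls during the
# backward pass: aligned, the peaks collide (8 + 8 = 16); shifted, they
# interleave and the combined peak drops to 12.
a = [8, 6, 4, 2]
b = [8, 6, 4, 2]
print(best_shift(a, b, budget=12))  # → (2, 12)
```

A real system has to handle jobs with different iteration lengths, profiles that drift over time, and the asynchrony between CPU-side allocation and GPU-side execution, which is what the later chapters of the thesis address.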