[Python 100-Day Challenge] Day 13 - Working with Sets (set)

Published 2025/03/13

Day 13 - Working with Sets

By YonYonWare

28 min read

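Let's learn about sets, a collection type that does not allow duplicates! In practice they are used for removing duplicates, fast membership tests, and set operations such as union and intersection. They let you express the mathematical notion of a set directly in code. (About a 25-minute read ⭐⭐)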

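🎯 Today's Learning Goals

Understand what a set is
Create and use sets
Learn the basic set operations
Apply set-algebra operations
Compare sets with other data structures
Use sets in practical scenarios

📚 Prerequisites

Day 7: Working with Lists (list) - creating and manipulating lists
Day 11: Working with Tuples (tuple) - properties of an immutable data structure
Day 12: Working with Dictionaries (dict) - key-value data structures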

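🎯 Learning Goal 1: Understand What a Set Is

In one line

Set = an unordered data structure that does not allow duplicates

It is the same concept as a set in mathematics.

Characteristics of a set

No duplicates: the same value cannot be stored twice
No order: elements cannot be accessed by index
Fast lookup: checking whether a value is present is quick
Mutable: values can be added and removed

Real-life analogy

📦 Ordinary box (list):
   [apple, banana, apple, grape]  # duplicates allowed

🎒 Special pouch (set):
   {apple, banana, grape}  # duplicates removed automatically!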

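Why use a set?

# Problem: extract only the unique answers from survey responses
responses = ["apple", "banana", "apple", "grape", "banana", "apple"]

# Removing duplicates with a list (verbose)
unique = []
for item in responses:
    if item not in unique:
        unique.append(item)

# Removing duplicates with a set (simple!)
unique = set(responses)
print(unique)  # {'apple', 'banana', 'grape'} (order may vary)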
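One thing the set-based version gives up is the original order of the responses. If first-seen order matters, a common alternative is `dict.fromkeys()`, which relies on dictionaries keeping insertion order (Python 3.7+). The sketch below is an illustration only; the variable name `unique_in_order` is not from the original post.

```python
responses = ["apple", "banana", "apple", "grape", "banana", "apple"]

# Dictionary keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates while preserving first-seen order.
unique_in_order = list(dict.fromkeys(responses))
print(unique_in_order)  # ['apple', 'banana', 'grape']
```

Use `set()` when order is irrelevant and you want set operations afterwards; use this variant when the result is shown to a user in the original order.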

🎯 Learning Goal 2: Create and Use Sets

Basic syntax

# Empty set
empty_set = set()  # ⚠️ {} creates an empty dictionary!

# Sets with data
fruits = {"apple", "banana", "grape"}
numbers = {1, 2, 3, 4, 5}

# Build a set from a list
my_list = [1, 2, 2, 3, 3, 3]
my_set = set(my_list)
print(my_set)  # {1, 2, 3} - duplicates removed!

Automatic duplicate removal

# Only one copy of each duplicated value is kept
numbers = {1, 2, 3, 2, 1, 3}
print(numbers)  # {1, 2, 3}

# Only the unique characters of a string
text = "hello"
unique_chars = set(text)
print(unique_chars)  # {'h', 'e', 'l', 'o'} (order may vary)

Checking the length

fruits = {"apple", "banana", "grape"}
print(len(fruits))  # 3

Set Comprehension

Like list comprehensions, set comprehensions let you build a set concisely.

# Only the even numbers from 1 to 10
even_numbers = {x for x in range(1, 11) if x % 2 == 0}
print(even_numbers)  # {2, 4, 6, 8, 10}

# Only the words with 4 or more characters from a list of strings
words = ["python", "is", "fun", "and", "easy"]
long_words = {word for word in words if len(word) >= 4}
print(long_words)  # {'python', 'easy'} (order may vary)

# Build a set of squares
squares = {x**2 for x in range(1, 6)}
print(squares)  # {1, 4, 9, 16, 25}

# Unique letters of a string (converted to uppercase)
text = "hello world"
unique_chars = {char.upper() for char in text if char.isalpha()}
print(unique_chars)  # {'H', 'E', 'L', 'O', 'W', 'R', 'D'} (order may vary)

🎯 Learning Goal 3: Learn the Basic Set Operations

Adding a value

fruits = {"apple", "banana"}

# add() - add a single value
fruits.add("grape")
print(fruits)  # {'apple', 'banana', 'grape'}

# Adding a duplicate is ignored
fruits.add("apple")
print(fruits)  # {'apple', 'banana', 'grape'} - unchanged

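Adding several values

fruits = {"apple", "banana"}

# update() - add several values from any iterable
fruits.update(["grape", "strawberry", "watermelon"])
print(fruits)  # {'apple', 'banana', 'grape', 'strawberry', 'watermelon'}

# Merge in another set
more_fruits = {"kiwi", "mango"}
fruits.update(more_fruits)
print(fruits)  # {'apple', 'banana', 'grape', 'strawberry', 'watermelon', 'kiwi', 'mango'}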

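Removing values

fruits = {"apple", "banana", "grape"}

# remove() - remove a value (raises KeyError if it is missing)
fruits.remove("banana")
print(fruits)  # {'apple', 'grape'}

# discard() - remove a value (no error if it is missing)
fruits.discard("strawberry")  # not present, but no error
print(fruits)  # {'apple', 'grape'}

# pop() - remove and return an arbitrary element (KeyError on an empty set)
item = fruits.pop()
print(f"Removed item: {item}")

# clear() - remove everything
fruits.clear()
print(fruits)  # set()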

The in operator

fruits = {"apple", "banana", "grape"}

# Membership test (very fast!)
print("apple" in fruits)    # True
print("strawberry" in fruits)    # False
print("watermelon" not in fruits)  # True
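A typical use of fast membership tests is filtering a sequence against a fixed set of allowed values. The following is a minimal sketch; the names `allowed`, `requests`, `granted`, and `denied` are made up for illustration.

```python
# Hypothetical data: actions that are permitted, and actions being requested
allowed = {"read", "write"}
requests = ["read", "delete", "write", "admin"]

# Each `in` check against a set is a hash lookup, not a scan
granted = [action for action in requests if action in allowed]
denied = [action for action in requests if action not in allowed]

print(granted)  # ['read', 'write']
print(denied)   # ['delete', 'admin']
```

Because the filtering iterates over the list, the results keep the order of `requests`; only the lookups go through the set.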

🎯 Learning Goal 4: Apply Set-Algebra Operations

Union - all elements

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Option 1: the | operator
union = set1 | set2
print(union)  # {1, 2, 3, 4, 5}

# Option 2: the union() method
union = set1.union(set2)
print(union)  # {1, 2, 3, 4, 5}
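The operator and the method are not fully interchangeable: the method forms (`union()`, `intersection()`, and so on) accept any iterable and any number of arguments, while the operators require both operands to be sets. A short sketch of the difference (the variable names `merged` and `operator_error` are illustrative):

```python
set1 = {1, 2, 3}

# union() accepts lists, tuples, and other iterables, several at once
merged = set1.union([3, 4], (5, 6))
print(merged)  # {1, 2, 3, 4, 5, 6}

# The | operator only works between sets (or frozensets)
operator_error = None
try:
    set1 | [3, 4]
except TypeError as exc:
    operator_error = type(exc).__name__

print(operator_error)  # TypeError
```

If the other operand might not be a set, either use the method or convert it with `set(...)` first.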

Intersection - common elements

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Option 1: the & operator
intersection = set1 & set2
print(intersection)  # {3}

# Option 2: the intersection() method
intersection = set1.intersection(set2)
print(intersection)  # {3}

Difference - elements found in only one side

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Elements only in set1
diff = set1 - set2
print(diff)  # {1, 2}

# Elements only in set2
diff = set2 - set1
print(diff)  # {4, 5}

# The difference() method
diff = set1.difference(set2)
print(diff)  # {1, 2}

Symmetric Difference - everything except the intersection

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Elements in exactly one of the two sets (intersection excluded)
sym_diff = set1 ^ set2
print(sym_diff)  # {1, 2, 4, 5}

# The symmetric_difference() method
sym_diff = set1.symmetric_difference(set2)
print(sym_diff)  # {1, 2, 4, 5}
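All four operations above return a new set. Each also has an in-place form that modifies the left-hand set instead: `|=`, `&=`, `-=`, and `^=` (equivalent to `update()`, `intersection_update()`, `difference_update()`, and `symmetric_difference_update()`). A small sketch with a made-up `tags` set:

```python
tags = {"python", "set"}

tags |= {"list"}                    # add elements: {'python', 'set', 'list'}
tags &= {"python", "list", "dict"}  # keep only common elements: {'python', 'list'}
tags -= {"list"}                    # drop elements: {'python'}
tags ^= {"python", "tuple"}         # keep elements in exactly one side: {'tuple'}

print(tags)  # {'tuple'}
```

The in-place forms are handy when a set is built up step by step and no copy of the previous state is needed.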

Subsets and containment

set1 = {1, 2, 3}
set2 = {1, 2, 3, 4, 5}

# Subset check
print(set1.issubset(set2))     # True (set1 is a subset of set2)
print(set1 <= set2)             # True (same meaning)

# Proper subset (a subset that is not equal to the other set)
print(set1 < set2)              # True (set1 is a proper subset of set2)

# Superset check
print(set2.issuperset(set1))   # True (set2 is a superset of set1)
print(set2 >= set1)             # True (same meaning)

# Proper superset (a superset that is not equal to the other set)
print(set2 > set1)              # True (set2 is a proper superset of set1)

# Disjoint check (no common elements)
set3 = {6, 7, 8}
print(set1.isdisjoint(set3))   # True (no common elements)
print(set1.isdisjoint(set2))   # False (they share elements)

Practical example:

# Checking administrator permissions
required_permissions = {"read", "write", "delete"}
user_permissions = {"read", "write", "delete", "admin"}

# Check whether the user has every required permission
if required_permissions.issubset(user_permissions):
    print("✅ Permissions are sufficient.")
else:
    missing = required_permissions - user_permissions
    print(f"❌ Missing permissions: {missing}")
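The same check can be packaged as a small reusable function. This is only a sketch: `missing_permissions` is a hypothetical helper name, and it assumes permissions are plain strings.

```python
def missing_permissions(required, granted):
    """Return the required permissions that are not in `granted`.

    Both arguments may be any iterable of hashable values.
    An empty result means the requirement is satisfied.
    """
    return set(required) - set(granted)


required = {"read", "write", "delete"}

full_access = missing_permissions(required, {"read", "write", "delete", "admin"})
read_only = missing_permissions(required, ["read"])

print(full_access)        # set()
print(sorted(read_only))  # ['delete', 'write']
```

Returning the difference rather than a boolean lets the caller both test for success (`not result`) and report exactly what is missing.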

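🎯 Learning Goal 5: Compare Sets with Other Data Structures

Set vs. list vs. dictionary

Feature	List	Set	Dictionary
Order	✅ Yes	❌ No	✅ Insertion order (3.7+)
Duplicates	✅ Allowed	❌ Not allowed	Keys must be unique
Index access	✅ Yes	❌ No	Access by key
Lookup speed	Slow (linear scan)	⚡ Fast (hash lookup)	⚡ Fast (hash lookup)
Typical use	Order matters	Removing duplicates	Key-value pairs

# List
my_list = [1, 2, 2, 3]
print(my_list[0])  # 1

# Set
my_set = {1, 2, 2, 3}
# print(my_set[0])  # TypeError! Sets do not support indexing

# Dictionary
my_dict = {"a": 1, "b": 2}
print(my_dict["a"])  # 1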

Iterating with a loop

fruits = {"apple", "banana", "grape"}

# Iterate with a for loop
for fruit in fruits:
    print(fruit)

# ⚠️ The order is not guaranteed!

# If you need a defined order, use sorted(), which returns a list
for fruit in sorted(fruits):
    print(fruit)

Converting between data structures

# List → set (removes duplicates)
numbers_list = [1, 2, 2, 3, 3, 4]
numbers_set = set(numbers_list)  # using the set() constructor
print(numbers_set)  # {1, 2, 3, 4}

# String → set (unique characters only)
text = "hello"
unique_chars = set(text)  # using the set() constructor
print(unique_chars)  # {'h', 'e', 'l', 'o'} (order may vary)
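Going the other way, from a set back to an ordered structure, usually involves `sorted()`, `list()`, or `tuple()`. A brief sketch with illustrative variable names:

```python
numbers_set = {3, 1, 2}

# Set → sorted list (sorted() always returns a list)
as_sorted_list = sorted(numbers_set)
print(as_sorted_list)  # [1, 2, 3]

# Set → list in descending order
largest_first = sorted(numbers_set, reverse=True)
print(largest_first)  # [3, 2, 1]

# Set → tuple with a defined order (sort first, then freeze)
as_tuple = tuple(sorted(numbers_set))
print(as_tuple)  # (1, 2, 3)
```

Calling `list(numbers_set)` directly also works, but the resulting order is whatever the set happens to iterate in, so sorting is the safer choice when the order is displayed or compared.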

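🎯 Learning Goal 6: Use Sets in Practical Scenarios

Example 1: Removing duplicates

# Remove duplicates from survey responses
responses = ["apple", "banana", "apple", "grape", "banana", "apple"]

unique_responses = set(responses)
print(f"Response types: {unique_responses}")
print(f"Unique responses: {len(unique_responses)}")

Output (element order may vary):

Response types: {'apple', 'banana', 'grape'}
Unique responses: 3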

Example 2: Finding shared hobbies

person1_hobbies = {"reading", "exercise", "movies", "music"}
person2_hobbies = {"gaming", "exercise", "movies", "cooking"}

# Shared hobbies
common = person1_hobbies & person2_hobbies
print(f"Shared hobbies: {common}")

# All hobbies
all_hobbies = person1_hobbies | person2_hobbies
print(f"All hobbies: {all_hobbies}")

# Hobbies only person1 has
only_person1 = person1_hobbies - person2_hobbies
print(f"Only person1's hobbies: {only_person1}")

Output (element order may vary):

Shared hobbies: {'exercise', 'movies'}
All hobbies: {'reading', 'exercise', 'movies', 'music', 'gaming', 'cooking'}
Only person1's hobbies: {'reading', 'music'}

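Example 3: Generating lottery numbers

import random

# Draw 6 numbers from 1 to 45 (no duplicates)
lotto = set()

# Keep drawing until the set holds 6 distinct numbers;
# duplicates are silently ignored by add()
while len(lotto) < 6:
    number = random.randint(1, 45)
    lotto.add(number)

print(f"Lottery numbers: {sorted(lotto)}")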
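The loop above is a nice demonstration of how a set ignores duplicates. For comparison, the standard library can draw distinct values in a single call with `random.sample()`, which samples without replacement. This is an alternative technique, not the approach used in the example above:

```python
import random

# random.sample() picks 6 distinct values from the population 1..45
lotto = random.sample(range(1, 46), 6)

print(f"Lottery numbers: {sorted(lotto)}")
```

Both versions produce 6 unique numbers between 1 and 45; `random.sample()` simply avoids the retry loop.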

Example 4: Course enrollment system

# Students enrolled in each course
class_a = {"홍길동", "김철수", "이영희", "박민수"}
class_b = {"김철수", "이영희", "최영수", "정민지"}
class_c = {"홍길동", "박민수", "최영수", "강호동"}

# 1. All students (union)
all_students = class_a | class_b | class_c
print(f"Total students: {len(all_students)}")
print(all_students)

# 2. Students taking all three courses (intersection)
all_classes = class_a & class_b & class_c
print(f"Taking all three courses: {all_classes}")

# 3. Students taking only A
only_a = class_a - class_b - class_c
print(f"Taking only A: {only_a}")

# 4. Students taking both A and B
a_and_b = class_a & class_b
print(f"Taking both A and B: {a_and_b}")
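Questions such as "who takes exactly one course?" are awkward to express with pairwise set operators once there are three or more sets. One possible approach, sketched here with `collections.Counter` (the names `enrollment`, `exactly_one`, and `two_or_more` are illustrative), is to count how many rosters each student appears in:

```python
from collections import Counter

class_a = {"홍길동", "김철수", "이영희", "박민수"}
class_b = {"김철수", "이영희", "최영수", "정민지"}
class_c = {"홍길동", "박민수", "최영수", "강호동"}

# Count how many course rosters each student appears in.
# Each roster is a set, so a student is counted at most once per course.
enrollment = Counter()
for roster in (class_a, class_b, class_c):
    enrollment.update(roster)

exactly_one = {name for name, n in enrollment.items() if n == 1}
two_or_more = {name for name, n in enrollment.items() if n >= 2}

print(f"Exactly one course: {exactly_one}")
print(f"Two or more courses: {two_or_more}")
```

With this data, 정민지 and 강호동 each take exactly one course, and the other five students take two.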

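💡 Practical Tips and Pitfalls

💡 Tip 1: Store only hashable (immutable) values

# ✅ Allowed: numbers, strings, and tuples of hashable values
good_set = {1, "hello", (1, 2)}

# ❌ Not allowed (lists and dictionaries are mutable, so they are unhashable)
# bad_set = {[1, 2, 3]}  # TypeError!
# bad_set = {{"a": 1}}   # TypeError!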
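Strictly speaking, the requirement is that elements be hashable. A tuple is only hashable if everything inside it is, so a tuple that contains a list is rejected too. When you need to deduplicate list-like rows, converting each row to a tuple is a common workaround. A minimal sketch with made-up data:

```python
# Deduplicate rows by converting each inner list to a tuple
rows = [[1, 2], [3, 4], [1, 2]]
unique_rows = {tuple(row) for row in rows}
print(sorted(unique_rows))  # [(1, 2), (3, 4)]

# A tuple that contains a list is still unhashable
try:
    {(1, [2, 3])}
except TypeError:
    tuple_with_list_ok = False
else:
    tuple_with_list_ok = True

print(tuple_with_list_ok)  # False
```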

💡 Tip 2: Be careful when creating an empty set

# ❌ Wrong (this creates an empty dictionary!)
empty = {}
print(type(empty))  # <class 'dict'>

# ✅ Correct
empty_set = set()
print(type(empty_set))  # <class 'set'>

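💡 Tip 3: Order is not guaranteed

my_set = {3, 1, 2}
print(my_set)  # the display order is an implementation detail; do not rely on it

# If you need a defined order, use sorted(), which returns a list
sorted_list = sorted(my_set)
print(sorted_list)  # [1, 2, 3]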

💡 Tip 4: Use sets for fast lookups

# Searching a list (slow: scans the elements one by one)
my_list = list(range(1000000))
print(999999 in my_list)  # slow

# Searching a set (fast: hash lookup)
my_set = set(range(1000000))
print(999999 in my_set)  # much faster!
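To see the difference rather than take it on faith, you can time both lookups with the standard `timeit` module. The sketch below uses a smaller collection (100,000 items) and 100 repetitions so it finishes quickly; the exact numbers depend on your machine, so no specific timings are claimed here.

```python
import timeit

n = 100_000
my_list = list(range(n))
my_set = set(my_list)
target = n - 1  # worst case for the list: the last element

# Run each membership test 100 times and measure the total time
list_time = timeit.timeit(lambda: target in my_list, number=100)
set_time = timeit.timeit(lambda: target in my_set, number=100)

print(f"list: {list_time:.6f}s")
print(f"set:  {set_time:.6f}s")
```

The gap widens as the collection grows, because the list lookup scans elements one by one while the set lookup goes through a hash table.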

💡 Tip 5: remove vs discard

fruits = {"apple", "banana"}

# remove() - raises an error if the value is missing
# fruits.remove("grape")  # KeyError!

# discard() - fine even if the value is missing
fruits.discard("grape")  # no error
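If a missing value should be treated as a real condition to handle (rather than silently ignored), `remove()` combined with `try`/`except KeyError` is a reasonable pattern. Note that `pop()` on an empty set raises `KeyError` as well. A short sketch (the flag names are illustrative):

```python
fruits = {"apple", "banana"}

# remove() signals a missing value with KeyError
removed = True
try:
    fruits.remove("grape")
except KeyError:
    removed = False
    print("grape was not in the set")

# pop() on an empty set also raises KeyError
empty = set()
popped = True
try:
    empty.pop()
except KeyError:
    popped = False
    print("cannot pop from an empty set")
```

Use `discard()` when "already absent" is fine, and `remove()` when absence indicates a bug or needs reporting.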

💡 Tip 6: Set operation symbols

set1 = {1, 2, 3}
set2 = {3, 4, 5}

# | : union
print(set1 | set2)  # {1, 2, 3, 4, 5}

# & : intersection
print(set1 & set2)  # {3}

# - : difference
print(set1 - set2)  # {1, 2}

# ^ : symmetric difference
print(set1 ^ set2)  # {1, 2, 4, 5}

💡 Tip 7: Immutable sets (frozenset)

# frozenset - cannot be changed once created
immutable_set = frozenset([1, 2, 3])
print(immutable_set)  # frozenset({1, 2, 3})

# Adding and removing are not possible
# immutable_set.add(4)  # AttributeError! frozenset has no add()

# A regular set cannot be used as a dictionary key
# my_dict = {set([1, 2]): "value"}  # TypeError!

# A frozenset can be used as a dictionary key
my_dict = {frozenset([1, 2]): "value"}
print(my_dict)  # {frozenset({1, 2}): 'value'}

# Set operations work between frozensets
fs1 = frozenset([1, 2, 3])
fs2 = frozenset([3, 4, 5])
print(fs1 | fs2)  # frozenset({1, 2, 3, 4, 5})
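Because a frozenset is hashable, it can also be an element of another set. One place this is useful is representing unordered pairs, where ("A", "B") and ("B", "A") should count as the same thing. The data below is hypothetical:

```python
# Hypothetical friendship records; direction does not matter
pairs = [("A", "B"), ("B", "A"), ("A", "C")]

# frozenset ignores order, so ("A", "B") and ("B", "A") collapse into one element
unique_pairs = {frozenset(pair) for pair in pairs}

print(len(unique_pairs))                     # 2
print(frozenset(["B", "A"]) in unique_pairs)  # True
```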

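🧪 Practice Problems

Problem 1: Deduplicating and analyzing text data

Task: Remove duplicate words from a sentence and print several statistics.

Initial data:

text = "python is fun python is easy python makes coding fun coding is powerful"

Requirements:

Split the sentence into words and store them in a list
Store the unique words (duplicates removed) in a set
Print the total word count (including duplicates) and the unique word count
Extract and print only the duplicated words (words that appear 2 or more times)
Print the unique words sorted alphabetically

💡 Hints

Split into words with split()
Remove duplicates with set()
Check frequency with the list's count() method
Sort with sorted()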

✅ Solution

text = "python is fun python is easy python makes coding fun coding is powerful"

# 1. Split the sentence into words
words = text.split()
print(f"All words: {words}")

# 2. Remove duplicates
unique_words = set(words)
print(f"\nUnique words: {unique_words}")

# 3. Print statistics
total_count = len(words)
unique_count = len(unique_words)
# Occurrences beyond the first one of each word
duplicate_count = total_count - unique_count
print(f"\n=== Statistics ===")
print(f"Total words: {total_count}")
print(f"Unique words: {unique_count}")
print(f"Repeated occurrences: {duplicate_count}")

# 4. Extract only the duplicated words (appearing 2 or more times)
duplicates = set()
for word in unique_words:
    if words.count(word) >= 2:
        duplicates.add(word)

print(f"\nDuplicated words: {duplicates}")
for word in duplicates:
    print(f"  - '{word}': {words.count(word)} times")

# 5. Sort alphabetically
sorted_words = sorted(unique_words)
print(f"\nAlphabetical order: {sorted_words}")

Output (the order of elements inside sets may vary):

All words: ['python', 'is', 'fun', 'python', 'is', 'easy', 'python', 'makes', 'coding', 'fun', 'coding', 'is', 'powerful']

Unique words: {'python', 'is', 'fun', 'easy', 'makes', 'coding', 'powerful'}

=== Statistics ===
Total words: 13
Unique words: 7
Repeated occurrences: 6

Duplicated words: {'python', 'is', 'fun', 'coding'}
  - 'python': 3 times
  - 'is': 3 times
  - 'fun': 2 times
  - 'coding': 2 times

Alphabetical order: ['coding', 'easy', 'fun', 'is', 'makes', 'powerful', 'python']
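Calling `count()` once per unique word rescans the whole list each time. For a short sentence that is fine; for longer texts, one possible alternative is `collections.Counter`, which tallies every word in a single pass. This is a sketch of that alternative, not part of the original solution:

```python
from collections import Counter

text = "python is fun python is easy python makes coding fun coding is powerful"

# Count every word in one pass over the list
counts = Counter(text.split())

# Words that appear two or more times
duplicates = {word for word, n in counts.items() if n >= 2}

for word in sorted(duplicates):
    print(f"'{word}': {counts[word]} times")

print(f"Unique words: {len(counts)}")  # Unique words: 7
```

The set of keys in `counts` is the same as `set(text.split())`, so the counter doubles as the deduplicated word collection.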

Problem 2: Friendship analysis system

Task: Manage several people's friend lists as sets and perform set operations on them.