AWS SQS Complete Guide: Message Groups, Usage Patterns, and Messaging Strategies

Conclusion

AWS SQS (Simple Queue Service) is a fully managed message queuing service that lets you decouple and scale microservices, distributed systems, and serverless applications (AWS official documentation).

The AWS documentation states: "Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications."

A message group is a logical grouping of messages in a FIFO queue, identified by the required MessageGroupId parameter. Messages within one group are delivered in strict order, while different groups can be processed in parallel (AWS official documentation, Message Group ID).

The AWS documentation states that messages within the same message group are always processed one at a time, in strict order, and that no two messages from the same group are processed simultaneously.

The main SQS usage patterns include work queues (decoupling components), buffer and batch operations (smoothing traffic bursts), the request-response pattern (asynchronous communication), and SNS + SQS fanout (parallel message distribution) (AWS Architecture Blog).

The AWS blog states: "Amazon SQS supports several common design patterns including Work Queues for decoupling components, Buffer and Batch Operations for adding scalability, Request Offloading for moving slow operations off interactive request paths, and Fanout which combines SQS with SNS."

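1. AWS SQS Overview

1.1 Key Characteristics

Amazon SQS stores messages on multiple servers for safety and uses redundant infrastructure to provide high availability (AWS SQS official page).

The AWS documentation states that Amazon SQS stores messages on multiple servers for safety, and that it uses redundant infrastructure to provide highly concurrent access to messages and high availability for producing and consuming them.

Key features:

Message size: up to 256 KB of text
Batching: up to 10 messages or 256 KB per request
Durability: messages are stored redundantly across multiple servers
Scalability: nearly unlimited throughput (Standard queue)

1.2 Message Processing Flow

Producer → SQS Queue → Consumer
           (stores messages)   (receives and processes messages)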

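2. Queue Types Compared: Standard vs FIFO

2.1 Standard Queue

Standard queues are characterized by nearly unlimited throughput (AWS SQS queue type comparison).

The cited blog post likewise names nearly unlimited throughput as their defining characteristic.

Characteristics:

Throughput: nearly unlimited (thousands to millions of messages per second)
Delivery guarantee: at-least-once delivery
Ordering: best-effort ordering (order is not guaranteed)
In-flight messages: up to approximately 120,000 per queue
Duplicates: the same message may be delivered more than once

What is an in-flight message?

An in-flight message is one that has been delivered to a consumer but not yet deleted from the queue (AWS official documentation, Visibility Timeout).

The AWS documentation states: "Messages are inflight after they have been received from the queue by a consuming component, but have not yet been deleted from the queue."

How it works:

When a consumer calls ReceiveMessage, the returned message becomes in-flight
For the duration of the visibility timeout it is hidden from other consumers
After processing the message, the consumer calls DeleteMessage to remove it from the queue permanently
If the visibility timeout expires before the message is deleted, the message becomes visible in the queue again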
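The lifecycle above can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: `ToyQueue` is a hypothetical class, not part of any AWS SDK, and time is passed in as plain integer seconds rather than read from a clock.

```python
class ToyQueue:
    """In-memory stand-in for one SQS queue (illustrative only)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> second at which it is visible again

    def send(self, message_id, now=0):
        self.messages[message_id] = now  # visible immediately

    def receive(self, now):
        """Return the first visible message and hide it for the timeout."""
        for message_id, visible_at in self.messages.items():
            if visible_at <= now:
                self.messages[message_id] = now + self.visibility_timeout
                return message_id
        return None

    def delete(self, message_id):
        """Remove the message for good, as DeleteMessage does."""
        self.messages.pop(message_id, None)

    def in_flight(self, now):
        """Messages received but neither deleted nor timed out yet."""
        return [m for m, visible_at in self.messages.items() if visible_at > now]
```

A message that is received but never deleted shows up again once `now` reaches the end of its visibility timeout, which is the redelivery behavior described in the last step.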

Standard queues support up to 120,000 in-flight messages, which caps the number of messages that can be in process at the same time (AWS re:Post, In-flight Messages).

Use cases:

When high throughput is required
When message order does not matter
Applications that can process messages idempotently

The AWS SQS FAQ gives this usage example: "Decouple live user requests from intensive background work: Let users upload media while resizing or encoding it."

2.2 FIFO Queue

FIFO queues guarantee that each message is processed exactly once and received in exactly the order in which it was sent (AWS SQS Standard vs FIFO).

The cited documentation lists two guarantees: exactly-once processing, meaning each message is processed exactly once, and first-in-first-out delivery, meaning messages are received in exactly the order they were sent.

Characteristics:

Throughput: limited (up to 3,000 messages per second with batching)
Delivery guarantee: exactly-once processing
Ordering: FIFO (strict ordering within each message group)
In-flight messages: up to 120,000 per queue (updated November 2024)
Deduplication: automatic within a 5-minute window

In-flight limit increase

In November 2024, Amazon SQS raised the in-flight message limit for FIFO queues from 20,000 to 120,000 (AWS What's New, SQS FIFO In-flight Limit).

The AWS announcement states: "With this change to the in-flight limit, your receivers can now process a maximum of 120K messages concurrently, increased from 20K previously, via SQS FIFO queues."

In other words, if you had sufficient publish throughput and were constrained by the previous 20K in-flight limit, you can now scale out your receivers to process up to 120K messages at a time.

Use cases:

When message order matters
When duplicates cannot be tolerated
Financial transactions, order processing, and similar workloads

AWS SQS FIFO vs Standard states: "FIFO queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can't be tolerated."

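2.3 Comparison Table

Item	Standard Queue	FIFO Queue
Throughput	Nearly unlimited	3,000 messages/s (with batching)
Delivery guarantee	At-least-once	Exactly-once
Ordering	Best-effort	Strict (per message group)
In-flight	120,000	120,000
Cost	Lower	Higher
Duplicates	Possible	Removed automatically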

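3. Message Groups in Detail

3.1 What Is a Message Group ID?

MessageGroupId is the attribute that organizes messages in a FIFO queue into distinct groups (AWS official documentation, FIFO Delivery Logic).

3.2 How It Works

Required parameter

Every message must carry a message group ID; sending a message without one fails (AWS official documentation).

The AWS documentation states this requirement directly.

Ordering within a group

Within one message group ID, all messages are sent and received in strict order; messages with different message group IDs may arrive or be processed out of order relative to each other.

Parallel processing

Multiple consumers can process messages from different message group IDs at the same time.

Operation flow

Core principles:

Per-group ordering: messages with the same Message Group ID are delivered in exactly the order they were published
Serial processing within a group: while messages from a group are in flight, no further messages from that group are delivered until they are deleted or their visibility timeout expires
Parallel processing across groups: messages with different Message Group IDs can be processed independently and at the same time
Deletion required: to receive the next messages of a group, the consumer must explicitly delete the ones it currently holds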
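A toy model can make these principles concrete. The sketch below is an assumption-laden simplification, not how SQS is implemented: `ToyFifoQueue` is a hypothetical class, it hands out one message per group at a time (real SQS may return several messages of one group in a single batch), and it ignores visibility timeouts.

```python
from collections import OrderedDict, deque


class ToyFifoQueue:
    """Illustrative model of per-group ordering and cross-group parallelism."""

    def __init__(self):
        self.groups = OrderedDict()  # group id -> deque of pending bodies
        self.locked = set()          # groups with an undeleted in-flight message

    def send(self, group_id, body):
        self.groups.setdefault(group_id, deque()).append(body)

    def receive(self):
        """Return (group_id, body) for the oldest message of the first unlocked group."""
        for group_id, pending in self.groups.items():
            if pending and group_id not in self.locked:
                self.locked.add(group_id)
                return group_id, pending[0]
        return None

    def delete(self, group_id):
        """Delete the group's in-flight message and unlock the group."""
        self.groups[group_id].popleft()
        self.locked.discard(group_id)
```

With two messages in `customer-123` and one in `customer-456`, two consecutive `receive()` calls return one message from each group, and the second `customer-123` message only becomes available after `delete("customer-123")`.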

3.3 Message Receive Behavior

Important: you cannot specify a MessageGroupId in ReceiveMessage

This is a common misconception, but the ReceiveMessage API has no MessageGroupId parameter (AWS re:Post, Receive Messages with specific Message Group Id).

AWS re:Post states: "You cannot request or filter ReceiveMessage API calls for a specific message group ID when using FIFO queues."

Sending vs receiving:

# ✅ On send: MessageGroupId can be set
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='Order details',
    MessageGroupId='customer-123',  # the group can be chosen here
    MessageDeduplicationId='order-456'
)

# ❌ On receive: filtering by MessageGroupId is not possible
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    # There is no MessageGroupId parameter at all!
    AttributeNames=['MessageGroupId']  # ⚠️ This is not a filter!
)

AttributeNames is not a filter

AttributeNames=['MessageGroupId'] asks SQS to include the MessageGroupId attribute in the response; it does not filter by MessageGroupId.

Actual receive behavior:

When you set MaxNumberOfMessages in a ReceiveMessage call, SQS returns as many messages with the same message group ID as it can, and may also return messages from other available message groups (Medium, AWS FIFO Queues).

The cited blog post states: "When you specify the MaxNumberOfMessages parameter in a ReceiveMessage API call, SQS will try to return as many messages with the same message group ID as possible, and may also return messages from other available message groups."

Example scenario:

# Queue state:
# - customer-123: 5 messages
# - customer-456: 3 messages
# - customer-789: 2 messages

# The consumer calls receive_message(MaxNumberOfMessages=10)
response = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    AttributeNames=['MessageGroupId']
)

# Possible result 1: messages from several groups arrive together
# [
#     {'MessageGroupId': 'customer-123', ...},
#     {'MessageGroupId': 'customer-123', ...},
#     {'MessageGroupId': 'customer-456', ...},
#     {'MessageGroupId': 'customer-789', ...},
#     ...
# ]

# Possible result 2: SQS picks the groups itself
# [
#     {'MessageGroupId': 'customer-123', ...},
#     {'MessageGroupId': 'customer-456', ...},
# ]

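Key points:

MessageGroupId can be specified only when sending
ReceiveMessage can return messages from several MessageGroupIds
The consumer cannot choose which group's messages it receives; SQS chooses
To process only a particular group's messages, filter at the application level after receiving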
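Application-level filtering can be as simple as bucketing the received messages by group. The helper below is a minimal sketch; `group_by_message_group` is a hypothetical name, and it assumes the messages have the shape of `response['Messages']` from a `receive_message` call that requested the MessageGroupId attribute.

```python
from collections import defaultdict


def group_by_message_group(messages):
    """Bucket received messages by their MessageGroupId attribute."""
    grouped = defaultdict(list)
    for message in messages:
        # The key is None when the attribute was not requested or is absent.
        group_id = message.get("Attributes", {}).get("MessageGroupId")
        grouped[group_id].append(message)
    return dict(grouped)
```

A consumer could then handle `grouped["customer-123"]` and leave the rest, keeping in mind that messages it received but did not delete stay in flight until their visibility timeout expires.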

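Important constraint:

Once you have received messages belonging to a message group ID, you must delete them (or let their visibility timeout expire) before you can receive more messages from that same message group ID.

3.4 Usage Example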

import boto3
 
sqs = boto3.client('sqs')
queue_url = 'https://sqs.region.amazonaws.com/account-id/queue-name.fifo'
 
# Group order messages per customer
response = sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='Order details',
    MessageGroupId='customer-123',  # use the customer ID as the group ID
    MessageDeduplicationId='order-456'
)

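Scenario: order processing system

All orders for customer-123 are processed in order
Orders for customer-456 are processed in parallel, independently of customer-123
Orders for the same customer are never processed concurrently

3.5 How MessageGroupId and MessageDeduplicationId Relate

Important: the two concepts are independent

Many people assume that MessageDeduplicationId removes duplicates only within the same MessageGroupId, but by default (DeduplicationScope set to queue) deduplication applies across the entire queue.

MessageGroupId (scope of ordering):

Organizes messages into logical groups
Guarantees order within a group
Allows parallel processing across groups

MessageDeduplicationId (scope of deduplication):

Prevents delivery of duplicate messages within a 5-minute window, across the whole queue
Works independently of MessageGroupId
A message that reuses a DeduplicationId is treated as a duplicate even if it belongs to a different group

Example: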

# Message 1: stored in the queue
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='Order A',
    MessageGroupId='customer-123',
    MessageDeduplicationId='order-001'  # first message
)

# Message 2: ❌ deduplicated (same group, same dedup ID)
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='Order A',
    MessageGroupId='customer-123',
    MessageDeduplicationId='order-001'  # duplicate within 5 minutes
)

# Message 3: ❌ deduplicated (different group, but same dedup ID)
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='Order A',
    MessageGroupId='customer-456',  # different group!
    MessageDeduplicationId='order-001'  # still a duplicate
)

# Message 4: ✅ stored in the queue (different dedup ID)
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody='Order B',
    MessageGroupId='customer-456',
    MessageDeduplicationId='order-002'  # different DeduplicationId
)

Key points:

MessageGroupId controls ordering and parallelism
MessageDeduplicationId prevents duplicate delivery across the whole queue
The two values work independently and serve different purposes

4. SQS SDK Parameters

4.1 SendMessage API Parameters

The SendMessage API sends a message to a queue and offers a range of options (AWS API Reference, SendMessage).

Required parameters:

Parameter	Description	Constraints
`QueueUrl`	URL of the queue to send to	Queue URLs and names are case-sensitive
`MessageBody`	Message content	Up to 256 KB; XML, JSON, or plain text

Optional parameters:

Parameter	Description	Default	Constraints
`DelaySeconds`	Delivery delay in seconds	0	0-900; cannot be set per message on FIFO queues (queue-level only)
`MessageAttributes`	User-defined message attributes	-	Each attribute has a DataType (String, Number, or Binary) and a matching value such as StringValue
`MessageDeduplicationId`	Deduplication ID for FIFO queues	-	Required on FIFO queues unless content-based deduplication is enabled
`MessageGroupId`	Message group ID for FIFO queues	-	Required on FIFO queues

The AWS documentation states: "MessageDeduplicationId is required for FIFO queues unless content-based deduplication is enabled."
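The FIFO requirements in the tables above can be checked before a request is ever sent. The following is a minimal sketch: `build_send_kwargs` is a hypothetical helper, it detects FIFO queues by the `.fifo` suffix that FIFO queue names carry, and the `content_based_dedup` flag is an assumption the caller supplies about how the queue is configured.

```python
def build_send_kwargs(queue_url, body, group_id=None, dedup_id=None,
                      content_based_dedup=False):
    """Build keyword arguments for send_message, validating FIFO rules."""
    kwargs = {"QueueUrl": queue_url, "MessageBody": body}
    if queue_url.endswith(".fifo"):
        if group_id is None:
            raise ValueError("MessageGroupId is required for FIFO queues")
        if dedup_id is None and not content_based_dedup:
            raise ValueError(
                "MessageDeduplicationId is required unless "
                "content-based deduplication is enabled"
            )
        kwargs["MessageGroupId"] = group_id
        if dedup_id is not None:
            kwargs["MessageDeduplicationId"] = dedup_id
    return kwargs
```

The result is meant to be passed on as `sqs.send_message(**kwargs)` with a boto3 client like the one used in the earlier examples.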

4.2 ReceiveMessage API Parameters

The ReceiveMessage API retrieves messages from a queue, up to 10 per call (AWS API Reference, ReceiveMessage).

Required parameters:

Parameter	Description
`QueueUrl`	URL of the queue to receive from

Optional parameters:

Parameter	Description	Default	Range/options
`MaxNumberOfMessages`	Maximum number of messages to return per call	1	1-10
`VisibilityTimeout`	Visibility timeout in seconds	Queue default	0-43,200 (12 hours)
`WaitTimeSeconds`	Long-polling wait time in seconds	Queue default (0)	0-20 (0 means short polling)
`AttributeNames`	System attributes to return	-	All, SenderId, SentTimestamp, ApproximateReceiveCount, and others
`MessageAttributeNames`	User-defined message attributes to return	-	Specific attribute names, or All/*

The AWS documentation states: "Using the WaitTimeSeconds parameter enables long-poll support."

Main attributes (AttributeNames):

All: returns all attributes
SenderId: ID of the message sender
SentTimestamp: time the message was sent (epoch time in milliseconds)
ApproximateReceiveCount: approximate number of times the message has been received
ApproximateFirstReceiveTimestamp: time the message was first received
MessageDeduplicationId: deduplication ID (FIFO queues)
MessageGroupId: message group ID (FIFO queues)
SequenceNumber: sequence number of the message (FIFO queues)
AWSTraceHeader: AWS X-Ray trace header
DeadLetterQueueSourceArn: for a message in a DLQ, the ARN of the source queue it came from

What is an ARN (Amazon Resource Name)?

An ARN is the standard format for uniquely identifying an AWS resource (AWS official documentation, ARN).

The AWS documentation states: "Amazon Resource Names (ARNs) uniquely identify AWS resources. We require an ARN when you need to specify a resource unambiguously across all of AWS, such as in IAM policies, Amazon Relational Database Service (Amazon RDS) tags, and API calls."

ARN format:

arn:partition:service:region:account-id:resource-id

Components:

partition: aws for standard AWS Regions (aws-cn for China Regions)
service: the AWS service (for example sqs, ec2, iam)
region: the Region where the resource lives (omitted for global resources)
account-id: the 12-digit ID of the AWS account that owns the resource
resource-id: the unique identifier of the resource

SQS ARN examples:

arn:aws:sqs:us-east-1:123456789012:MyQueue
arn:aws:sqs:us-east-1:123456789012:MyQueue.fifo
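To show how the components line up with the format, here is a small parser for the six-field layout given above. `Arn` and `parse_arn` are hypothetical names for illustration; the sketch only handles the `arn:partition:service:region:account-id:resource-id` shape and does not validate the individual fields.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource_id: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its components; the resource ID may contain colons."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])
```

For the second example above, `parse_arn(...).resource_id` is `MyQueue.fifo` and `.service` is `sqs`.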

4.3 Queue Attributes

The main attributes you can set when creating or modifying a queue (AWS API Reference, SetQueueAttributes):

Attribute	Description	Default	Range
`VisibilityTimeout`	Default visibility timeout	30 seconds	0 seconds-12 hours
`MessageRetentionPeriod`	Message retention period	4 days	60 seconds-14 days
`DelaySeconds`	Default delivery delay	0 seconds	0-900 seconds
`MaximumMessageSize`	Maximum message size	256 KB	1 KB-256 KB
`ReceiveMessageWaitTimeSeconds`	Default long-polling wait time	0 seconds	0-20 seconds
`RedrivePolicy`	DLQ configuration (JSON)	-	deadLetterTargetArn, maxReceiveCount
`FifoQueue`	Whether the queue is FIFO	false	true/false (cannot be changed after creation)
`ContentBasedDeduplication`	Content-based deduplication (FIFO)	false	true/false
`DeduplicationScope`	Deduplication scope (FIFO)	queue	queue/messageGroup
`FifoThroughputLimit`	Throughput limit scope (FIFO)	perQueue	perQueue/perMessageGroupId

The AWS documentation states: "VisibilityTimeout has a range of 0 seconds to 12 hours with a default value of 30 seconds."

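5. SQS Usage Patterns

5.1 Work Queues (Decoupling)

The most basic pattern for decoupling components, and the primary use case for asynchronous workflows (Schibsted blog).

The cited blog post states: "Asynchronous workflows have always been the primary use case for SQS, where using queues ensures one component can keep running smoothly without losing data when another component is unavailable or slow."

Examples:

Separating web servers from background workers
Communication between microservices
Image processing, video encoding, and similar jobs

5.2 Buffer and Batch Operations

A pattern that absorbs traffic bursts and improves efficiency through batching.

Benefits:

Absorbs sudden traffic spikes
Buffers differences in processing speed between producers and consumers
Reduces cost through batching

SQS can handle up to 10 messages or 256 KB per batch request, and a batch is billed the same as a single message (AWS SQS best practices).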
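Respecting both batch limits at once takes a little bookkeeping. The function below is a rough sketch of one way to split message bodies into batches; `chunk_for_batch` and the two constants are hypothetical names, and the size check counts only the UTF-8 body, ignoring message attributes, which is a simplification.

```python
MAX_BATCH_ENTRIES = 10           # at most 10 messages per batch request
MAX_BATCH_BYTES = 256 * 1024     # at most 256 KB per batch request


def chunk_for_batch(bodies):
    """Split message bodies into batches that fit both limits."""
    batches, current, current_bytes = [], [], 0
    for body in bodies:
        size = len(body.encode("utf-8"))
        if size > MAX_BATCH_BYTES:
            raise ValueError("a single message exceeds the size limit")
        batch_full = (len(current) == MAX_BATCH_ENTRIES
                      or current_bytes + size > MAX_BATCH_BYTES)
        if current and batch_full:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(body)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Each returned batch could then become the entries of one `send_message_batch` call, so 25 small messages cost three requests instead of 25.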

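5.3 Request-Response Pattern

A request-response pattern built on temporary queues; it is the most common use case for temporary queues (AWS blog, Temporary Queue).

The AWS blog states: "The most common use case for temporary queues is the request-response messaging pattern, where a requester creates a temporary queue for receiving each response message."

How it works:

The requester creates a temporary response queue
The request message includes the URL of the response queue
The processor sends a message to the response queue when the work is done
The requester receives the response and deletes the temporary queue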
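The four steps can be walked through with in-memory queues standing in for SQS. Everything here is hypothetical scaffolding for illustration: the `queues` dictionary, the `reply_to` field, and the worker that simply upper-cases the payload are assumptions, and a real system would run requester and worker in separate processes.

```python
import queue
import uuid

queues = {}  # queue name -> in-memory queue, standing in for SQS queues


def handle_one_request(request_queue_name):
    """Worker side: read one request and reply to the queue named in it."""
    request = queues[request_queue_name].get_nowait()
    queues[request["reply_to"]].put({"result": request["payload"].upper()})


def request_response(payload, request_queue_name="requests"):
    reply_name = f"reply-{uuid.uuid4()}"
    queues[reply_name] = queue.Queue()             # 1. create a temporary reply queue
    queues.setdefault(request_queue_name, queue.Queue()).put(
        {"payload": payload, "reply_to": reply_name}  # 2. name it in the request
    )
    handle_one_request(request_queue_name)         # 3. worker processes and replies
    response = queues[reply_name].get_nowait()     # 4. read the response...
    del queues[reply_name]                         #    ...and delete the temporary queue
    return response
```

Calling `request_response("ping")` returns the worker's reply and leaves no `reply-` queue behind.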

5.4 SNS + SQS Fanout

What is SNS (Simple Notification Service)?

SNS is the fully managed pub/sub messaging service provided by AWS (AWS official documentation, SNS).

Key characteristics:

Publish-subscribe (pub/sub) model: a publisher publishes a message to a topic and its subscribers receive it
Multiple protocols: SMS, email, HTTP/HTTPS, SQS, Lambda, mobile push notifications, and more
Fan-out: one message is delivered to multiple endpoints at once
Message filtering: subscribers can receive only the messages they care about
High availability: distributed across multiple Availability Zones

Benefits of combining SNS and SQS

A pattern that distributes one message to multiple queues at the same time (AWS blog, Resilient Patterns).

The AWS blog states: "Adding an SQS queue between the SNS topic and its subscriber adds resilience to message delivery since the messages are durably stored in a queue, and it throttles the rate of messages to the consumer, helping smooth out traffic bursts."

Benefits:

Better message durability
Each consumer processes at its own pace
Traffic bursts are smoothed out

5.5 Recursive Scaling Pattern

A recursive scaling pattern for highly parallel computing (AWS Architecture Blog).

The AWS blog states: "Each ECS task processes one named node at a time, recursively posts back more messages to the SQS queue for each child node, which prompts the automatic scaling mechanism to spin up more ECS tasks because the queue size has changed."

Examples:

Processing tree-structured data
Graph traversal
Distributing large-scale parallel work

6. Messaging Strategies

6.1 Visibility Timeout

The period after a message is received during which other consumers cannot see it (AWS official documentation).

Visibility timeout vs exactly-once processing

Many people assume that the visibility timeout is what guarantees exactly-once processing in FIFO queues, but the two are separate mechanisms (AWS re:Post, FIFO Queue Exactly-Once).

Exactly-once processing (FIFO queues):

MessageDeduplicationId automatically removes duplicate messages within a 5-minute window
A message with a given deduplication ID is delivered to the queue only once within that window
This prevents duplicates at the publish stage

Visibility timeout:

Marks a received message as being processed
Prevents other consumers from processing the same message at the same time
If the message is not deleted within the visibility timeout, it becomes visible in the queue again and may be reprocessed
This prevents concurrent processing at the consume stage

AWS re:Post states: "SQS FIFO guarantees that you will get each message once, only if you process it and delete it during the visibility timeout. If you do not delete the message or extend the visibility timeout, the message will be available for other consumers."

Summary:

Exactly-once = MessageDeduplicationId: keeps duplicate messages out of the queue
Visibility timeout: keeps other consumers from picking up a message while it is being processed
Together: a FIFO message is processed exactly once, provided the consumer deletes it before the visibility timeout expires

Best practice:

Set the visibility timeout to match your processing time; if you are unsure, start with a short timeout (for example, 2 minutes) and extend it as needed (AWS re:Post).

AWS re:Post states: "Start by setting the visibility timeout to match the maximum time your application typically needs to process and delete a message. If you're unsure about the exact processing time, begin with a shorter timeout (for example, 2 minutes) and extend it as necessary."

With Lambda:

When the queue feeds a Lambda function, the recommendation is to set the visibility timeout to at least six times the function's execution timeout.

Important constraints:

Maximum of 12 hours
Extending the timeout does not reset the 12-hour limit
Use a heartbeat mechanism to extend the timeout periodically
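The two numeric rules above, the six-times guideline for Lambda and the 12-hour cap on extensions, reduce to simple arithmetic. The helpers below are a sketch with hypothetical names; `elapsed_s` is assumed to be the number of seconds since the message was first received.

```python
MAX_VISIBILITY_SECONDS = 12 * 60 * 60  # 43,200 seconds


def lambda_queue_visibility_timeout(function_timeout_s):
    """Six times the function timeout, capped at the 12-hour maximum."""
    return min(6 * function_timeout_s, MAX_VISIBILITY_SECONDS)


def clamp_extension(elapsed_s, requested_s):
    """Largest heartbeat extension that stays within 12 hours of the receive."""
    remaining = MAX_VISIBILITY_SECONDS - elapsed_s
    if remaining <= 0:
        raise ValueError("12-hour visibility limit already reached")
    return min(requested_s, remaining)
```

A heartbeat loop could call `clamp_extension` before each visibility extension so that a long-running job never asks for more time than the limit allows.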

6.2 Dead Letter Queue (DLQ)

A queue that isolates messages that failed processing so they can be debugged (AWS official documentation, DLQ).

The AWS documentation states: "Amazon SQS supports dead-letter queues (DLQs), which source queues can target for messages that are not processed successfully. DLQs are useful for debugging your application because you can isolate unconsumed messages to determine why processing did not succeed."

Configuring a DLQ

1. Create the DLQ

First create the queue that will serve as the DLQ; it must be in the same AWS account and Region as the source queue (AWS official documentation, Configure DLQ).

# Create the DLQ with the AWS CLI
aws sqs create-queue --queue-name MyDeadLetterQueue

# Look up the DLQ's ARN
aws sqs get-queue-attributes \
  --queue-url https://sqs.region.amazonaws.com/account-id/MyDeadLetterQueue \
  --attribute-names QueueArn

2. Set the Redrive Policy

The redrive policy identifies the DLQ and specifies how many times an individual message can be received before it is routed to the DLQ (AWS SDK, DLQ Setup).

// TypeScript (AWS SDK v3)
import { SQSClient, SetQueueAttributesCommand } from '@aws-sdk/client-sqs';
 
const client = new SQSClient({ region: 'us-east-1' });
 
const redrivePolicy = {
  deadLetterTargetArn: 'arn:aws:sqs:us-east-1:123456789012:MyDeadLetterQueue',
  maxReceiveCount: '5'  // move to the DLQ after 5 receives
};
 
await client.send(new SetQueueAttributesCommand({
  QueueUrl: 'https://sqs.us-east-1.amazonaws.com/123456789012/MySourceQueue',
  Attributes: {
    RedrivePolicy: JSON.stringify(redrivePolicy)
  }
}));

3. Process Messages in the DLQ

You can inspect messages that have arrived in the DLQ and reprocess them:

// Read messages from the DLQ (reuses `client` from the previous snippet)
import { ReceiveMessageCommand } from '@aws-sdk/client-sqs';

// Placeholder URL for the DLQ created in step 1
const dlqUrl = 'https://sqs.us-east-1.amazonaws.com/123456789012/MyDeadLetterQueue';

const result = await client.send(new ReceiveMessageCommand({
  QueueUrl: dlqUrl,
  MaxNumberOfMessages: 10,
  AttributeNames: ['All'],
  MessageAttributeNames: ['All']
}));

// Inspect each message to analyze why it failed
result.Messages?.forEach(message => {
  console.log('Failed message:', message.Body);
  console.log('Receive count:', message.Attributes?.ApproximateReceiveCount);
  console.log('First receive time:', message.Attributes?.ApproximateFirstReceiveTimestamp);
});

4. DLQ Redrive (Reprocessing)

Once you have analyzed and fixed the problem, you can send the messages in the DLQ back to the source queue (AWS official documentation, DLQ Redrive).

# Redrive the DLQ with the AWS CLI
aws sqs start-message-move-task \
  --source-arn arn:aws:sqs:us-east-1:123456789012:MyDeadLetterQueue \
  --destination-arn arn:aws:sqs:us-east-1:123456789012:MySourceQueue

Components:

maxReceiveCount: number of times a message can be received before it is moved to the DLQ (recommended: 3-10)
deadLetterTargetArn: ARN of the DLQ
DLQ type: must match the source queue (Standard ↔ Standard, FIFO ↔ FIFO)
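The same components can be assembled and sanity-checked in Python before they are applied to a queue. This is a minimal sketch: `build_redrive_policy` is a hypothetical helper, and it infers the queue type from the `.fifo` suffix of each ARN.

```python
import json


def build_redrive_policy(source_queue_arn, dlq_arn, max_receive_count=5):
    """Return the RedrivePolicy attribute value as a JSON string."""
    # A FIFO source queue needs a FIFO DLQ, and a Standard one a Standard DLQ.
    if source_queue_arn.endswith(".fifo") != dlq_arn.endswith(".fifo"):
        raise ValueError("source queue and DLQ must be the same type")
    if max_receive_count < 1:
        raise ValueError("maxReceiveCount must be at least 1")
    return json.dumps({
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    })
```

The returned string is what would go into the `RedrivePolicy` entry of the queue attributes, as the TypeScript example does with `JSON.stringify`.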

Recommended setting:

As a best practice, set maxReceiveCount high enough to leave room for Amazon SQS retries (AWS re:Post).

AWS re:Post states: "If the maxReceiveCount value in your redrive policy is set too low (e.g., 1 or 2), messages might be moved to the DLQ before Lambda has a chance to process them successfully. It's recommended to set this value higher to allow for multiple processing attempts."

Monitoring:

Monitor DLQ depth (the ApproximateNumberOfMessagesVisible CloudWatch metric)
Set CloudWatch alarms to track the error rate
Analyze and reprocess failed messages
Track stale messages with the ApproximateAgeOfOldestMessage metric

6.3 Message Deduplication

FIFO queues automatically remove duplicate messages within a 5-minute deduplication window (AWS official documentation, Deduplication).

The AWS documentation states: "MessageDeduplicationId is a token used only in Amazon SQS FIFO queues to prevent duplicate message delivery, ensuring that within a 5-minute deduplication window, only one instance of a message with the same deduplication ID is processed and delivered."

What is the 5-minute deduplication window?

It means that when a message with the same MessageDeduplicationId is published again, it is dropped as a duplicate only if it arrives within 5 minutes of the original.

How it works:

First message published (time T0): MessageDeduplicationId = "order-123" → stored in the queue
Duplicate published (T0 + 2 minutes): MessageDeduplicationId = "order-123" → detected as a duplicate; accepted but not delivered
Duplicate published (T0 + 4 minutes): MessageDeduplicationId = "order-123" → detected as a duplicate; accepted but not delivered
Duplicate published (T0 + 6 minutes): MessageDeduplicationId = "order-123" → outside the 5-minute window; treated as a new message and stored in the queue

Important notes:

The 5-minute window starts when the first message arrives in the queue
After 5 minutes the same MessageDeduplicationId can be used again
The window length is fixed and cannot be changed
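The timeline above can be replayed with a few lines of Python. This is a toy model of the window, not the service's implementation: `DedupWindow` is a hypothetical class and time is passed in as seconds since T0.

```python
DEDUP_WINDOW_SECONDS = 5 * 60


class DedupWindow:
    """Tracks when each deduplication ID was last accepted."""

    def __init__(self):
        self.first_seen = {}  # deduplication id -> arrival time of the accepted message

    def offer(self, dedup_id, now):
        """Return True if the message is enqueued, False if dropped as a duplicate."""
        started = self.first_seen.get(dedup_id)
        if started is not None and now - started < DEDUP_WINDOW_SECONDS:
            return False  # still inside the window opened by the first message
        self.first_seen[dedup_id] = now  # a new window starts here
        return True
```

Offering `"order-123"` at 0, 120, 240, and 360 seconds yields enqueued, dropped, dropped, enqueued, matching the four steps listed above.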

Two methods:

Content-Based Deduplication
- Uses a SHA-256 hash of the message body
- The deduplication ID is generated automatically
- Message attributes are not included in the hash

The AWS documentation states: "AWS uses a SHA-256 hash to generate a message deduplication ID using the body of the message", and adds that the hash covers the body "but not the attributes of the message".

Explicit Deduplication
- The developer supplies a unique deduplication ID
- Allows finer-grained control
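To see why content-based deduplication ignores attributes, it helps to compute a body-only hash by hand. The function below is an illustration of the idea rather than a reproduction of what SQS does internally; `content_based_dedup_id` is a hypothetical name, and UTF-8 encoding of the body is an assumption.

```python
import hashlib


def content_based_dedup_id(body: str) -> str:
    """SHA-256 hex digest of the message body only; attributes play no part."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()
```

Two messages with identical bodies get the same ID no matter what attributes they carry, so a producer that needs them treated as distinct would supply an explicit MessageDeduplicationId instead.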

Standard queue equivalent:

With a Standard queue, you must design your application to be idempotent (AWS SQS FAQ).

The AWS FAQ states: "If you use a standard queue, you must design your applications to be idempotent (that is, they must not be affected adversely when processing the same message more than once)."

6.4 Polling Methods

Long Polling (recommended)

Queries all servers and waits until at least one message is available, up to 20 seconds (AWS official documentation, Polling).

The AWS documentation states: "Long polling queries all servers for messages, sending a response once at least one message is available, and an empty response is sent only if the polling wait time expires. The maximum long polling wait time is 20 seconds."

Configuring Long Polling

Long polling is an option provided by AWS, and it can be enabled in two ways:

Queue level: set the ReceiveMessageWaitTimeSeconds attribute to 1-20 seconds
Request level: pass the WaitTimeSeconds parameter in the ReceiveMessage call

NestJS Implementation Example

// sqs.module.ts
import { Module } from '@nestjs/common';
import { SqsService } from './sqs.service';
import { SqsProcessor } from './sqs.processor';
 
@Module({
  providers: [SqsService, SqsProcessor],
  exports: [SqsService],
})
export class SqsModule {}

// sqs.service.ts
import { Injectable, Logger, OnModuleInit, OnModuleDestroy } from '@nestjs/common';
import {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
  Message
} from '@aws-sdk/client-sqs';
import { ConfigService } from '@nestjs/config';

@Injectable()
export class SqsService implements OnModuleInit, OnModuleDestroy {
  private readonly logger = new Logger(SqsService.name);
  private readonly sqsClient: SQSClient;
  private readonly queueUrl: string;
  private isPolling = false;

  constructor(private configService: ConfigService) {
    this.sqsClient = new SQSClient({
      region: this.configService.get('AWS_REGION')
    });
    this.queueUrl = this.configService.get('SQS_QUEUE_URL');
  }

  async onModuleInit() {
    // Start long polling when the module initializes.
    // Deliberately not awaited: the loop runs until shutdown.
    void this.startPolling();
  }

  async onModuleDestroy() {
    // Stop polling when the module shuts down
    this.isPolling = false;
  }

  /**
   * Receive messages with long polling
   */
  async receiveMessages(): Promise<Message[]> {
    try {
      const command = new ReceiveMessageCommand({
        QueueUrl: this.queueUrl,
        MaxNumberOfMessages: 10,           // up to 10 per call
        WaitTimeSeconds: 20,                // long polling: wait up to 20 seconds
        MessageAttributeNames: ['All'],     // fetch all user-defined attributes
        AttributeNames: ['All'],            // fetch all system attributes
        VisibilityTimeout: 30,              // hide from other consumers for 30 seconds
      });

      const response = await this.sqsClient.send(command);
      return response.Messages || [];
    } catch (error) {
      this.logger.error('Failed to receive messages', error);
      throw error;
    }
  }

  /**
   * Delete a message (call after processing completes)
   */
  async deleteMessage(receiptHandle: string): Promise<void> {
    try {
      await this.sqsClient.send(new DeleteMessageCommand({
        QueueUrl: this.queueUrl,
        ReceiptHandle: receiptHandle,
      }));
      this.logger.debug('Message deleted');
    } catch (error) {
      this.logger.error('Failed to delete message', error);
      throw error;
    }
  }

  /**
   * Continuous long-polling loop
   */
  private async startPolling() {
    this.isPolling = true;
    this.logger.log('SQS long polling started');

    while (this.isPolling) {
      try {
        const messages = await this.receiveMessages();

        if (messages.length > 0) {
          this.logger.log(`Received ${messages.length} message(s)`);

          // Process messages in parallel
          // (for a FIFO queue, process them sequentially to preserve order)
          await Promise.all(
            messages.map(message => this.processMessage(message))
          );
        }
      } catch (error) {
        this.logger.error('Polling error', error);
        // On error, wait 5 seconds before retrying
        await this.sleep(5000);
      }
    }

    this.logger.log('SQS long polling stopped');
  }

  /**
   * Process a single message
   */
  private async processMessage(message: Message): Promise<void> {
    try {
      this.logger.debug(`Processing message: ${message.MessageId}`);

      // Parse the message body
      const body = JSON.parse(message.Body || '{}');

      // Run the business logic
      await this.handleBusinessLogic(body);

      // Delete the message once processing succeeds
      await this.deleteMessage(message.ReceiptHandle!);

      this.logger.debug(`Finished message: ${message.MessageId}`);
    } catch (error) {
      this.logger.error(`Failed to process message: ${message.MessageId}`, error);
      // A message that is not deleted is redelivered after the visibility timeout
    }
  }

  /**
   * Actual business logic
   */
  private async handleBusinessLogic(data: any): Promise<void> {
    // Implement the real processing logic here
    this.logger.log('Handling business logic:', data);
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

# .env settings
AWS_REGION=us-east-1
SQS_QUEUE_URL=https://sqs.us-east-1.amazonaws.com/123456789012/MyQueue

Key options:

WaitTimeSeconds: 20: enables long polling (0 means short polling)
MaxNumberOfMessages: 10: receive up to 10 messages per call
VisibilityTimeout: 30: hide messages from other consumers for 30 seconds while they are processed
MessageAttributeNames: ['All']: return all user-defined attributes

Performance comparison:

Replacing short polling with long polling reduced empty responses by about 97% while also giving quicker response times (Medium, Fast Polling).

The cited blog post states: "In practice, implementing long polling as a replacement of a short polling algorithm has reduced empty receives by approximately 97%, while having quicker response times and less polling threads."

Cost comparison:

Long polling: at most 3 empty responses per minute (with a 20-second wait)
Short polling: up to 240 empty responses per minute (polling 4 times per second)
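Both figures follow from dividing one minute by the time between requests on an idle queue. The one-line helper below spells that out; `empty_receives_per_minute` is a hypothetical name, and the model assumes a single polling thread and a queue that stays empty.

```python
def empty_receives_per_minute(seconds_between_requests):
    """Empty ReceiveMessage responses per minute from one poller on an idle queue."""
    return 60 / seconds_between_requests


long_polling = empty_receives_per_minute(20)     # one request per 20-second wait
short_polling = empty_receives_per_minute(0.25)  # four requests per second
reduction = 1 - long_polling / short_polling     # share of empty receives avoided
```

Under these assumptions long polling issues 3 requests per minute against 240, a reduction of 98.75% for this idealized idle-queue case; the 97% quoted above is a measured figure from the cited post, not a result of this calculation.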

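Short Polling

Queries only a subset of servers and responds immediately, even when there are no messages.

Differences from long polling:

Characteristic	Long Polling	Short Polling
`WaitTimeSeconds`	1-20	0 (default)
Servers queried	All servers	A subset of servers
Empty queue behavior	Waits for a message (up to 20 seconds)	Returns an empty response immediately
Cost	Lower (few empty responses)	Higher (many empty responses)
Latency	Returns as soon as a message is available; waits up to 20 seconds only when the queue is empty	Responds immediately, even when empty

NestJS Short Polling Example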

// Short polling example, for comparison with long polling
import { Injectable, Logger } from '@nestjs/common';
import { SQSClient, ReceiveMessageCommand, Message } from '@aws-sdk/client-sqs';
import { ConfigService } from '@nestjs/config';

@Injectable()
export class SqsShortPollingService {
  private readonly logger = new Logger(SqsShortPollingService.name);
  private readonly sqsClient: SQSClient;
  private readonly queueUrl: string;

  constructor(private configService: ConfigService) {
    this.sqsClient = new SQSClient({
      region: this.configService.get('AWS_REGION')
    });
    this.queueUrl = this.configService.get('SQS_QUEUE_URL');
  }

  /**
   * Receive messages with short polling
   */
  async receiveMessagesShortPolling(): Promise<Message[]> {
    const command = new ReceiveMessageCommand({
      QueueUrl: this.queueUrl,
      MaxNumberOfMessages: 10,
      WaitTimeSeconds: 0,  // ⬅️ short polling (default)
      MessageAttributeNames: ['All'],
      AttributeNames: ['All'],
    });

    const response = await this.sqsClient.send(command);
    return response.Messages || [];
  }

  /**
   * Short-polling loop: inefficient because most responses are empty
   */
  async startShortPolling() {
    let emptyResponses = 0;
    let totalRequests = 0;

    while (true) {
      totalRequests++;
      const messages = await this.receiveMessagesShortPolling();

      if (messages.length === 0) {
        emptyResponses++;
        // Empty responses are frequent, so pause briefly before retrying
        await this.sleep(250);  // wait 0.25 seconds
      } else {
        this.logger.log(`Received ${messages.length} message(s)`);
        await Promise.all(messages.map(msg => this.processMessage(msg)));
      }

      // Log statistics
      if (totalRequests % 100 === 0) {
        const emptyRate = (emptyResponses / totalRequests * 100).toFixed(2);
        this.logger.log(`Empty response rate: ${emptyRate}% (${emptyResponses}/${totalRequests})`);
      }
    }
  }

  /**
   * Placeholder handler; a real consumer would process and then delete the message
   */
  private async processMessage(message: Message): Promise<void> {
    this.logger.debug(`Processing message: ${message.MessageId}`);
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}

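Performance and cost comparison:

// Long polling (recommended)
// - API calls per minute on an idle queue: 3
// - Empty responses: rare (only when no message arrives within the wait)
// - Cost: low
const longPollingCommand = new ReceiveMessageCommand({
  QueueUrl: queueUrl,   // assumed to be defined elsewhere
  WaitTimeSeconds: 20,  // wait up to 20 seconds
  MaxNumberOfMessages: 10
});

// Short polling (not recommended)
// - API calls per minute: up to 240 (one every 0.25 seconds)
// - Empty responses: very frequent (95%+)
// - Cost: high
const shortPollingCommand = new ReceiveMessageCommand({
  QueueUrl: queueUrl,   // assumed to be defined elsewhere
  WaitTimeSeconds: 0,   // respond immediately
  MaxNumberOfMessages: 10
});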

When to use which:

Long polling is preferable in almost all cases
Use short polling only in special cases that need an immediate response

The AWS official documentation (Best Practices) states: "In almost all cases, Amazon SQS long polling is preferable to short polling."

7. Security Best Practices

7.1 Encryption

Server-Side Encryption (SSE)

To mitigate data leakage, use server-side encryption (SSE) to encrypt data at rest; Amazon SQS encrypts data at the message level when storing it and decrypts it when you access it (AWS SQS security best practices).

The AWS documentation states this recommendation directly.

Transport security

Use the aws:SecureTransport condition in the queue policy to enforce encrypted connections, allowing only requests made over HTTPS (TLS).
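A queue policy that enforces this typically denies every request where aws:SecureTransport is false. The function below builds such a policy document as a Python dictionary; `deny_insecure_transport_policy` and the `Sid` value are hypothetical names chosen for this sketch.

```python
import json


def deny_insecure_transport_policy(queue_arn):
    """Queue policy that rejects any SQS request not made over TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",   # a Deny for everyone is safe; it grants nothing
            "Action": "sqs:*",
            "Resource": queue_arn,
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    }


policy_json = json.dumps(
    deny_insecure_transport_policy("arn:aws:sqs:us-east-1:123456789012:MyQueue")
)
```

The JSON string would be supplied as the queue's `Policy` attribute. Because the statement is a Deny, using "*" as the principal here does not open the queue to anyone.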

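7.2 Access Control

Principle of least privilege

Unless you explicitly require it, make sure the queue is not publicly accessible to everyone on the internet: do not set Principal to "*" or use wildcards; name specific users instead.

Using IAM roles

Use IAM roles to manage temporary credentials for applications that need to access Amazon SQS; with roles you do not have to distribute long-term credentials to EC2 instances or AWS services.

7.3 Network Isolation

VPC Endpoints

For queues that must never be exposed to the internet, use VPC endpoints to restrict access to hosts within a specific VPC.

8. Selection Guide

8.1 Choosing a Queue Type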

Ordering required?
├─ YES → FIFO Queue
│         └─ Throughput < 3,000/s?
│             ├─ YES → Use a FIFO queue
│             └─ NO → Rethink the architecture or partition
└─ NO → Standard Queue
          └─ Can you implement idempotency?
              ├─ YES → Use a Standard queue
              └─ NO → Consider a FIFO queue
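The decision tree translates directly into a small function, which can be handy in design reviews or provisioning scripts. `choose_queue_type` and its return strings are hypothetical; the 3,000 messages per second threshold is the batched FIFO figure used throughout this guide.

```python
def choose_queue_type(needs_ordering, messages_per_second, can_be_idempotent):
    """Walk the decision tree above and return its recommendation."""
    if needs_ordering:
        if messages_per_second < 3000:
            return "FIFO"
        return "rethink architecture or partition"
    if can_be_idempotent:
        return "Standard"
    return "consider FIFO"
```

For example, an order pipeline that needs ordering at 500 messages per second lands on FIFO, while an idempotent thumbnail generator with no ordering needs lands on Standard.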

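8.2 Message Group Strategy

With a single group:

Strict ordering across the entire queue
No parallel processing
Limited throughput

With multiple groups:

Ordering guaranteed per group
Parallel processing across groups
Higher throughput

Group ID design examples:

Customer ID: customer-{id}
Order type: order-type-{type}
Region: region-{region}
Tenant: tenant-{tenant-id}

8.3 Cost Optimization Checklist

✅ Use long polling to minimize empty responses
✅ Use batch operations to reduce the number of API calls
✅ Tune the visibility timeout to avoid unnecessary reprocessing
✅ Use a DLQ to isolate and analyze failed messages
✅ Use FIFO only where needed (it costs more)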
