[Python] - ElementTree Concepts and Basic Example Code

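What is ElementTree

ElementTree is a module in Python's standard library for creating, reading, modifying, and deleting XML data. It represents an XML document as a tree of elements, which makes the document straightforward to manipulate.
- What is XML
  XML (eXtensible Markup Language) is a tag-based markup language for storing and structuring data.
  It resembles HTML, but differs in that users can define their own tags.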
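
To make the "tree of elements" idea concrete, here is a minimal sketch that parses a small XML string and walks one level down. The sample document is a hypothetical one made up for illustration; only ET.fromstring and the standard Element attributes (tag, attrib, text) are used.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE_XML = """
<library>
    <book id="101">
        <title>Python Programming</title>
    </book>
</library>
""".strip()

root = ET.fromstring(SAMPLE_XML)   # root Element: <library>
first_book = root[0]               # child elements can be indexed like a list

print(root.tag)                            # tag name of the root
print(first_book.tag, first_book.attrib)   # tag name and attribute dict
print(first_book[0].text)                  # text inside <title>
```

Each element exposes its tag name, its attributes as a dictionary, and its text, and its children are reachable by index or iteration.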

Usage examples

1. Importing the libraries

import os
import xml.dom.minidom
import xml.etree.ElementTree as ET

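os: used to check whether a file exists and to delete it.
xml.dom.minidom: used to print XML in a readable, indented form.
xml.etree.ElementTree: the core library for creating and manipulating XML data.

3. Checking whether the XML file to be created already exists, and deleting it if so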

save_path = r"C:\WorkSpace\python-basic\data\element_tree_basic_data.xml"
if os.path.exists(save_path):
    os.remove(save_path)
    print(f"Deleted existing XML file: {save_path}")

os.path.exists(save_path): checks whether a file exists at the given path.
os.remove(save_path): deletes the existing XML file.
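
As an alternative to the exists-then-remove pair, the same step can be sketched with pathlib. This is not what the article's code does; remove_if_exists is a hypothetical helper name, and the demo runs in a temporary directory rather than the article's C:\ path. Path.unlink(missing_ok=True) needs Python 3.8 or later.

```python
import tempfile
from pathlib import Path


def remove_if_exists(path):
    """Delete the file at path if present; return True if it existed."""
    target = Path(path)
    existed = target.exists()
    target.unlink(missing_ok=True)   # no error when the file is already absent
    return existed


# Demonstration in a temporary directory (assumed location, for illustration).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "element_tree_basic_data.xml"
    demo_path.write_text("<library />", encoding="utf-8")
    first_call = remove_if_exists(demo_path)    # the file existed
    second_call = remove_if_exists(demo_path)   # the file is already gone

print(first_call, second_call)
```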

4. Creating the XML tree

library = ET.Element("library")  # create the <library> root element
self.add_book(library, "101", "Python Programming", "John Doe", "Programming", 2021, True)
self.add_book(library, "102", "Machine Learning Basics", "Jane Smith", "AI", 2020, False)
self.add_book(library, "103", "Data Science Handbook", "Emily White", "Data Science", 2019, True)

ET.Element("library"): creates the <library> tag, which serves as the root element.
self.add_book(...): calls the helper method that adds one book's information.

The add_book function

def add_book(self, library, book_id, title, author, genre, published, available):
    book = ET.SubElement(library, "book", id=book_id)
    ET.SubElement(book, "title").text = title
    ET.SubElement(book, "author").text = author
    ET.SubElement(book, "genre").text = genre
    ET.SubElement(book, "published").text = str(published)
    ET.SubElement(book, "available").text = str(available).lower()

ET.SubElement(library, "book", id=book_id): creates a <book> tag under library and sets its id attribute.
ET.SubElement(book, "title").text = title: creates a <title> tag and sets its text.
str(available).lower() converts the available value so that True/False is stored as true/false.
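
To see what add_book produces, the sketch below uses a standalone version of the same function (without self, and returning the new element, which is an addition for illustration) and serializes the result with ET.tostring. Note that attribute values and element text are passed as strings, which is why published and available are converted.

```python
import xml.etree.ElementTree as ET


def add_book(library, book_id, title, author, genre, published, available):
    """Standalone variant of the article's method, for illustration."""
    book = ET.SubElement(library, "book", id=book_id)
    ET.SubElement(book, "title").text = title
    ET.SubElement(book, "author").text = author
    ET.SubElement(book, "genre").text = genre
    ET.SubElement(book, "published").text = str(published)          # int -> "2021"
    ET.SubElement(book, "available").text = str(available).lower()  # True -> "true"
    return book


library = ET.Element("library")
add_book(library, "101", "Python Programming", "John Doe", "Programming", 2021, True)

# encoding="unicode" returns a str instead of bytes
xml_text = ET.tostring(library, encoding="unicode")
print(xml_text)
```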

5. Saving the XML file

save_path = os.path.join(r"C:\WorkSpace\python-basic\data", "element_tree_basic_data.xml")
os.makedirs(os.path.dirname(save_path), exist_ok=True)

# Save the XML file
ET.ElementTree(library).write(save_path, encoding="utf-8", xml_declaration=True)

os.makedirs(os.path.dirname(save_path), exist_ok=True): creates the target directory if it does not exist.
ET.ElementTree(library).write(...): writes the XML data to a file.
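
The following sketch runs the same save step against a temporary directory (an assumption, so it works on any machine) and reads the file back to show what xml_declaration=True adds. The one-book tree is a placeholder for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Placeholder tree for illustration.
library = ET.Element("library")
ET.SubElement(library, "book", id="101")

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = os.path.join(tmp_dir, "data", "element_tree_basic_data.xml")
    os.makedirs(os.path.dirname(save_path), exist_ok=True)   # creates "data"
    ET.ElementTree(library).write(save_path, encoding="utf-8", xml_declaration=True)

    # Read the raw text back to inspect what was written.
    with open(save_path, encoding="utf-8") as xml_file:
        saved_text = xml_file.read()

print(saved_text)
```

The file starts with an XML declaration line, followed by the document on a single line; write does not add indentation on its own.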

6. Reading the XML file and printing every element

tree = ET.parse(file_path)
root = tree.getroot()
for book in root.findall("book"):
    book_info = self.print_element(book)
    print(book_info)

ET.parse(file_path): reads the XML file and creates a tree object.
tree.getroot(): returns the root element.
root.findall("book"): finds every <book> element that is a direct child of the root.

The print_element function

def print_element(self, book):
    book_id = book.get("id")
    title = book.find("title").text
    author = book.find("author").text
    genre = book.find("genre").text
    published = book.find("published").text
    available = book.find("available").text

    book_info = (f"id: {book_id}, "
                 f"title: {title}, "
                 f"author: {author}, "
                 f"genre: {genre}, "
                 f"published: {published}, "
                 f"available: {available}")

    return book_info

book.get("id"): returns the value of the id attribute.
book.find("title").text: returns the text of the first matching child tag.
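
One caveat: book.find("title") returns None when the tag is missing, so .text would raise AttributeError on incomplete data. A more defensive variant, sketched here with a hypothetical describe function and made-up sample data, uses findtext with a default value.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the second book deliberately lacks an <author> tag.
SAMPLE_XML = """
<library>
    <book id="101"><title>Python Programming</title><author>John Doe</author></book>
    <book id="102"><title>Untitled Draft</title></book>
</library>
""".strip()


def describe(book):
    """Build a summary line without failing on missing child tags."""
    book_id = book.get("id", "unknown")
    title = book.findtext("title", default="(no title)")
    author = book.findtext("author", default="(no author)")
    return f"id: {book_id}, title: {title}, author: {author}"


root = ET.fromstring(SAMPLE_XML)
summaries = [describe(book) for book in root.findall("book")]
for line in summaries:
    print(line)
```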

7. Printing entries that match a condition

for book in root.findall("book"):
    if book.find("genre").text == "AI":
        print(f"AI-related book: {book.find('title').text} by {book.find('author').text}")

book.find("genre").text == "AI": prints the books whose <genre> tag value is AI.
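
The same filter can also be written as a path predicate instead of an if inside the loop. This is an alternative to the article's approach, sketched on made-up sample data; the [tag='text'] predicate is part of the limited XPath subset that ElementTree supports.

```python
import xml.etree.ElementTree as ET

# Sample data for illustration.
SAMPLE_XML = """
<library>
    <book id="101"><title>Python Programming</title><genre>Programming</genre></book>
    <book id="102"><title>Machine Learning Basics</title><genre>AI</genre></book>
    <book id="104"><title>Deep Learning Guide</title><genre>AI</genre></book>
</library>
""".strip()

root = ET.fromstring(SAMPLE_XML)

# Select <book> children whose <genre> child has exactly the text "AI".
ai_books = root.findall("book[genre='AI']")
ai_titles = [book.findtext("title") for book in ai_books]
print(ai_titles)
```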

8. Modifying a specific value

for book in root.findall("book"):
    if book.get("id") == "103":
        book.find("published").text = "1995"
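
Assigning to .text only changes the tree in memory; the file on disk stays the same until the tree is written again. The sketch below illustrates this with a temporary file (the file name and one-book document are assumptions for illustration).

```python
import os
import tempfile
import xml.etree.ElementTree as ET

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "books.xml")   # hypothetical file name
    root = ET.fromstring(
        "<library><book id='103'><published>2019</published></book></library>"
    )
    ET.ElementTree(root).write(file_path, encoding="utf-8", xml_declaration=True)

    # Change the value in memory only.
    tree = ET.parse(file_path)
    tree.getroot().find("book[@id='103']/published").text = "1995"

    # Re-reading the file still shows the old value...
    before_write = ET.parse(file_path).getroot().findtext("book/published")

    # ...until the modified tree is written back.
    tree.write(file_path, encoding="utf-8", xml_declaration=True)
    after_write = ET.parse(file_path).getroot().findtext("book/published")

print(before_write, after_write)
```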

9. Adding an entry

new_book = ET.SubElement(root, "book", id="104")
ET.SubElement(new_book, "title").text = "Deep Learning Guide"
ET.SubElement(new_book, "author").text = "Michael Brown"
ET.SubElement(new_book, "genre").text = "AI"
ET.SubElement(new_book, "published").text = "2023"
ET.SubElement(new_book, "available").text = "true"

ET.SubElement(root, "book", id="104"): adds a new <book> element inside the <library> element.
After adding the new book, save the XML file again.
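
If books are added in several places, the repeated SubElement calls can be wrapped in a small helper. The append_book function below is a hypothetical sketch, not part of the article's code; it also refuses to add a second book with an id that already exists, which is an assumption about the desired behavior.

```python
import xml.etree.ElementTree as ET


def append_book(root, book_id, **fields):
    """Hypothetical helper: add a <book> with the given id and child tags."""
    if root.find(f"book[@id='{book_id}']") is not None:
        raise ValueError(f"book id {book_id} already exists")
    book = ET.SubElement(root, "book", id=book_id)
    for tag, value in fields.items():
        ET.SubElement(book, tag).text = str(value)   # element text must be a string
    return book


root = ET.fromstring(
    "<library><book id='101'><title>Python Programming</title></book></library>"
)
append_book(root, "104", title="Deep Learning Guide", author="Michael Brown",
            genre="AI", published=2023, available="true")

book_ids = [book.get("id") for book in root.findall("book")]
print(book_ids)
```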

10. Deleting specific entries

for book in root.findall("book"):
    if book.find("available").text == "false":
        root.remove(book)

Finds the books whose <available> element is false and removes them.
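
Removing inside the loop is safe here because findall returns a list, so the loop iterates over a snapshot rather than over the element being modified. The sketch below, on made-up data with two consecutive unavailable books, shows that both are removed.

```python
import xml.etree.ElementTree as ET

# Sample data for illustration: books 1 and 2 are both unavailable.
root = ET.fromstring(
    "<library>"
    "<book id='1'><available>false</available></book>"
    "<book id='2'><available>false</available></book>"
    "<book id='3'><available>true</available></book>"
    "</library>"
)

# findall returns a list, so removing from root does not disturb the loop.
for book in root.findall("book"):
    if book.findtext("available") == "false":
        root.remove(book)

remaining_ids = [book.get("id") for book in root]
print(remaining_ids)
```

When looping over the element itself (for book in root), iterating over a copy such as list(root) gives the same protection.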

11. Finding elements with XPath

book = root.find(".//book[@id='101']")
if book is not None:
    print(f"Title of book ID 101: {book.find('title').text}")

root.find(".//book[@id='101']"): uses XPath to search for the book whose id is 101.
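
ElementTree implements only a limited subset of XPath, but several other expressions are available besides the attribute predicate. The sketch below shows a few of them on sample data made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data for illustration.
SAMPLE_XML = """
<library>
    <book id="101"><title>Python Programming</title></book>
    <book id="103"><title>Data Science Handbook</title></book>
    <book id="104"><title>Deep Learning Guide</title></book>
</library>
""".strip()
root = ET.fromstring(SAMPLE_XML)

# Attribute predicate followed by a child step
title_101 = root.findtext(".//book[@id='101']/title")

# Every <title> at any depth below the root
all_titles = [title.text for title in root.findall(".//title")]

# Position predicates are 1-based; last() selects the final match
second_book_id = root.find("book[2]").get("id")
last_book_id = root.find("book[last()]").get("id")

print(title_101, all_titles, second_book_id, last_book_id)
```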

12. XML Pretty Print

xml_str = ET.tostring(root, encoding="utf-8")
pretty_xml = xml.dom.minidom.parseString(xml_str).toprettyxml(indent="  ")
print(pretty_xml)

ET.tostring(root, encoding="utf-8"): serializes the XML data to a UTF-8 encoded byte string.
xml.dom.minidom.parseString(xml_str).toprettyxml(indent="  "): re-parses that data and returns the XML formatted with two-space indentation.
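
On Python 3.9 or later, ET.indent offers an alternative that avoids the round trip through minidom. This is a sketch of that alternative, not the article's method; it modifies the tree in place by inserting whitespace, and the one-book document is a placeholder.

```python
import xml.etree.ElementTree as ET

# Placeholder document for illustration.
root = ET.fromstring(
    "<library><book id='101'><title>Python Programming</title></book></library>"
)

ET.indent(root, space="  ")                       # requires Python 3.9+
pretty_text = ET.tostring(root, encoding="unicode")
print(pretty_text)
```

Because the whitespace is stored in the tree, a later tree.write call also produces an indented file.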

Full code

ElementTree usage example

import os
import xml.dom.minidom
import xml.etree.ElementTree as ET


class ElementTreeBasic:
    def element_tree_basic(self):
        print("Element Tree Basic")

        save_path = r"C:\WorkSpace\python-basic\data\element_tree_basic_data.xml"

        # Delete the existing XML file if there is one
        if os.path.exists(save_path):
            os.remove(save_path)
            print(f"Deleted existing XML file: {save_path}")

        print("\n-----Create and save the Element Tree-----")
        # Create the root element
        library = ET.Element("library")  # creates the <library> tag
        # Add the book data
        self.add_book(library, "101", "Python Programming", "John Doe", "Programming", 2021, True)
        self.add_book(library, "102", "Machine Learning Basics", "Jane Smith", "AI", 2020, False)
        self.add_book(library, "103", "Data Science Handbook", "Emily White", "Data Science", 2019, True)

        # Path to save to (C:\WorkSpace\python-basic\data\element_tree_basic_data.xml)
        save_path = os.path.join(r"C:\WorkSpace\python-basic\data", "element_tree_basic_data.xml")

        # Create the folder if it does not exist
        os.makedirs(os.path.dirname(save_path), exist_ok=True)

        # Save the XML file
        tree = ET.ElementTree(library)
        tree.write(save_path, encoding="utf-8", xml_declaration=True)

        print(f"XML file saved at: {os.path.abspath(save_path)}")

        print("\n-----Read the XML file and print every book-----")
        # File path (raw string so the backslashes are kept as-is)
        file_path = r"C:\WorkSpace\python-basic\data\element_tree_basic_data.xml"

        # Parse the XML file
        tree = ET.parse(file_path)  # parse the file into a tree object
        root = tree.getroot()  # get the root element

        # Print every book
        for book in root.findall("book"):
            book_info = self.print_element(book)

            print(book_info)

        print("\n-----Find books matching a condition (genre AI)-----")
        for book in root.findall("book"):
            if book.find("genre").text == "AI":
                print(f"AI-related book: {book.find('title').text} by {book.find('author').text}")

        print("\n-----Modify XML data (update publication year)-----")
        print("Before")
        for book in root.findall("book"):
            book_info = self.print_element(book)
            print(book_info)

        # This edit changes the in-memory tree only; it is not written to the file
        for book in root.findall("book"):
            if book.get("id") == "103":
                book.find("published").text = "1995"

        print("\nAfter")
        for book in root.findall("book"):
            book_info = self.print_element(book)
            print(book_info)

        print("\n-----Add XML data (add a new book)-----")
        file_path = r"C:\WorkSpace\python-basic\data\element_tree_basic_data.xml"

        # Re-parsing from disk discards the unsaved edit to book 103 above
        tree = ET.parse(file_path)
        root = tree.getroot()

        new_book = ET.SubElement(root, "book", id="104")
        ET.SubElement(new_book, "title").text = "Deep Learning Guide"
        ET.SubElement(new_book, "author").text = "Michael Brown"
        ET.SubElement(new_book, "genre").text = "AI"
        ET.SubElement(new_book, "published").text = "2023"
        ET.SubElement(new_book, "available").text = "true"

        tree.write(file_path, encoding="utf-8", xml_declaration=True)

        root = tree.getroot()
        for book in root.findall("book"):
            book_info = self.print_element(book)
            print(book_info)

        print("\n-----Delete XML data (remove unavailable books)-----")
        # findall returns a list, so removing inside the loop is safe
        for book in root.findall("book"):
            if book.find("available").text == "false":
                root.remove(book)

        tree.write(file_path, encoding="utf-8", xml_declaration=True)

        for book in root.findall("book"):
            book_info = self.print_element(book)
            print(book_info)

        print("\n-----Find a specific element with XPath-----")
        book = root.find(".//book[@id='101']")
        if book is not None:
            print(f"Title of book ID 101: {book.find('title').text}")

        print("\n-----XML Pretty Print (readable XML output)-----")
        xml_str = ET.tostring(root, encoding="utf-8")
        pretty_xml = xml.dom.minidom.parseString(xml_str).toprettyxml(indent="  ")
        print(pretty_xml)

    def add_book(self, library, book_id, title, author, genre, published, available):
        # Add one book to the library element
        book = ET.SubElement(library, "book", id=book_id)  # create <book> with an id attribute
        ET.SubElement(book, "title").text = title  # create <title> and set its text
        ET.SubElement(book, "author").text = author  # create <author> and set its text
        ET.SubElement(book, "genre").text = genre  # create <genre> and set its text
        ET.SubElement(book, "published").text = str(published)  # create <published>; text must be a string
        ET.SubElement(book, "available").text = str(available).lower()  # create <available> (lowercase true/false)

    def print_element(self, book):
        # Build a one-line summary of a <book> element
        book_id = book.get("id")
        title = book.find("title").text
        author = book.find("author").text
        genre = book.find("genre").text
        published = book.find("published").text
        available = book.find("available").text

        book_info = (f"id: {book_id}, "
                     f"title: {title}, "
                     f"author: {author}, "
                     f"genre: {genre}, "
                     f"published: {published}, "
                     f"available: {available}")

        return book_info

if __name__ == '__main__':
    element_tree_basic = ElementTreeBasic()
    element_tree_basic.element_tree_basic()

r'''
Output
Element Tree Basic
Deleted existing XML file: C:\WorkSpace\python-basic\data\element_tree_basic_data.xml

-----Create and save the Element Tree-----
XML file saved at: C:\WorkSpace\python-basic\data\element_tree_basic_data.xml

-----Read the XML file and print every book-----
id: 101, title: Python Programming, author: John Doe, genre: Programming, published: 2021, available: true
id: 102, title: Machine Learning Basics, author: Jane Smith, genre: AI, published: 2020, available: false
id: 103, title: Data Science Handbook, author: Emily White, genre: Data Science, published: 2019, available: true

-----Find books matching a condition (genre AI)-----
AI-related book: Machine Learning Basics by Jane Smith

-----Modify XML data (update publication year)-----
Before
id: 101, title: Python Programming, author: John Doe, genre: Programming, published: 2021, available: true
id: 102, title: Machine Learning Basics, author: Jane Smith, genre: AI, published: 2020, available: false
id: 103, title: Data Science Handbook, author: Emily White, genre: Data Science, published: 2019, available: true

After
id: 101, title: Python Programming, author: John Doe, genre: Programming, published: 2021, available: true
id: 102, title: Machine Learning Basics, author: Jane Smith, genre: AI, published: 2020, available: false
id: 103, title: Data Science Handbook, author: Emily White, genre: Data Science, published: 1995, available: true

-----Add XML data (add a new book)-----
id: 101, title: Python Programming, author: John Doe, genre: Programming, published: 2021, available: true
id: 102, title: Machine Learning Basics, author: Jane Smith, genre: AI, published: 2020, available: false
id: 103, title: Data Science Handbook, author: Emily White, genre: Data Science, published: 2019, available: true
id: 104, title: Deep Learning Guide, author: Michael Brown, genre: AI, published: 2023, available: true

-----Delete XML data (remove unavailable books)-----
id: 101, title: Python Programming, author: John Doe, genre: Programming, published: 2021, available: true
id: 103, title: Data Science Handbook, author: Emily White, genre: Data Science, published: 2019, available: true
id: 104, title: Deep Learning Guide, author: Michael Brown, genre: AI, published: 2023, available: true

-----Find a specific element with XPath-----
Title of book ID 101: Python Programming

-----XML Pretty Print (readable XML output)-----
<?xml version="1.0" ?>
<library>
  <book id="101">
    <title>Python Programming</title>
    <author>John Doe</author>
    <genre>Programming</genre>
    <published>2021</published>
    <available>true</available>
  </book>
  <book id="103">
    <title>Data Science Handbook</title>
    <author>Emily White</author>
    <genre>Data Science</genre>
    <published>2019</published>
    <available>true</available>
  </book>
  <book id="104">
    <title>Deep Learning Guide</title>
    <author>Michael Brown</author>
    <genre>AI</genre>
    <published>2023</published>
    <available>true</available>
  </book>
</library>
'''
