
Working with CSV, JSON, and XML Data in Python

Introduction

This tutorial covers how to work with CSV, JSON, and XML data in Python. These are common formats that you will encounter when working with data. By the end of this guide, you should be able to read, write, and manipulate data in each of them.

Working with CSV Data

CSV (Comma-Separated Values) is a simple file format for storing tabular data, such as the contents of a spreadsheet or a database table. CSV files can be easily imported and exported by spreadsheet and database programs, including but not limited to Microsoft Excel, OpenOffice Calc, and Google Sheets.

Python provides a built-in module called csv for reading and writing CSV files.

Reading CSV files

import csv

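# newline='' lets the csv module handle line endings itself
with open('example.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # each row is a list of strings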
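When the first row of a file holds column names, csv.DictReader can map each later row to those names. The following is a minimal sketch, assuming a header of name, age, and city; the sample data is hypothetical and is read from an in-memory string so the example runs without a file on disk.

```python
import csv
import io

# Hypothetical sample data standing in for the contents of a CSV file.
sample = "name,age,city\nJohn,30,New York\nJane,25,Boston\n"

# io.StringIO wraps the string in a file-like object; with a real file,
# pass the object returned by open('example.csv', newline='') instead.
reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)

for row in rows:
    # Every value is read as a string, so numbers need explicit conversion.
    print(row["name"], int(row["age"]), row["city"])
```

Accessing fields by name rather than by position tends to keep the code readable when columns are added or reordered.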

Writing to CSV files

import csv

with open('example.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["column1", "column2", "column3"])  # header row
    writer.writerow(["data1", "data2", "data3"])  # data row
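If the data is already a list of dictionaries, csv.DictWriter can write it without building each row by hand. This sketch assumes hypothetical records with name, age, and city keys, and writes to an in-memory buffer rather than a file so that it is self-contained.

```python
import csv
import io

# Hypothetical records; the keys must match the fieldnames given below.
records = [
    {"name": "John", "age": 30, "city": "New York"},
    {"name": "Jane", "age": 25, "city": "Boston"},
]

# An in-memory buffer stands in for open('example.csv', 'w', newline='').
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "city"])
writer.writeheader()       # writes the column names as the first row
writer.writerows(records)  # writes one row per dictionary

csv_text = buffer.getvalue()
print(csv_text)
```

The fieldnames list also fixes the column order in the output, regardless of the key order in each dictionary.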

Working with JSON Data

JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. It is a text format that is completely language independent but uses conventions that are familiar to programmers of the C family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others.

Python provides a json module to work with JSON data.

Reading JSON Data

import json

with open('example.json', 'r') as file:
    data = json.load(file)  # parses the file into Python objects
    print(data)
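JSON often arrives as a string rather than a file, for example in a web response. In that case json.loads (with a trailing "s") does the parsing. The sketch below uses a hypothetical JSON document to show how JSON types map to Python types and how a parse error can be caught.

```python
import json

# Hypothetical JSON text; objects become dicts, arrays become lists,
# and null becomes None.
text = '{"name": "John", "age": 30, "languages": ["Python", "C"], "address": null}'
data = json.loads(text)

print(data["name"])
print(data["languages"][0])
print(data["address"] is None)

# Malformed input raises json.JSONDecodeError, which can be handled explicitly.
error_message = None
try:
    json.loads("{not valid json}")
except json.JSONDecodeError as exc:
    error_message = str(exc)
    print("Could not parse JSON:", error_message)
```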

Writing JSON Data

import json

data = {
    "name": "John",
    "age": 30,
    "city": "New York"
}

with open('example.json', 'w') as file:
    json.dump(data, file)  # serializes the dictionary to the file
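By default the output is written on a single line. For files that people will read, the indent and sort_keys parameters produce a more readable layout. The following sketch uses json.dumps, which returns a string instead of writing to a file; the same parameters are accepted by json.dump.

```python
import json

data = {"name": "John", "age": 30, "city": "New York"}

# indent=4 puts each key on its own line; sort_keys=True orders keys alphabetically.
pretty = json.dumps(data, indent=4, sort_keys=True)
print(pretty)

# Parsing the text again gives back an equal dictionary.
restored = json.loads(pretty)
print(restored == data)
```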

Working with XML Data

XML (eXtensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The design goals of XML emphasize simplicity, generality, and usability across the Internet.

Python provides the xml.etree.ElementTree module to work with XML data.

Reading XML Data

import xml.etree.ElementTree as ET

tree = ET.parse('example.xml')
root = tree.getroot()

for child in root:
    print(child.tag, child.attrib)  # element name and its attributes
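Beyond iterating over direct children, elements can be looked up by tag with find and findall, and their text and attributes read with .text and .get. The sketch below assumes a hypothetical library document with book elements, parsed from a string with ET.fromstring so that it runs without an example.xml file.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; the element and attribute names are illustrative only.
xml_text = (
    "<library>"
    '<book id="1"><title>First Title</title><year>2001</year></book>'
    '<book id="2"><title>Second Title</title><year>2015</year></book>'
    "</library>"
)

root = ET.fromstring(xml_text)

books = []
for book in root.findall("book"):       # all direct children named "book"
    book_id = book.get("id")            # attribute value, as a string
    title = book.find("title").text     # text of the first <title> child
    year = int(book.find("year").text)  # element text is always a string
    books.append((book_id, title, year))

print(books)
```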

Writing XML Data

import xml.etree.ElementTree as ET

root = ET.Element("root")
doc = ET.SubElement(root, "doc")

ET.SubElement(doc, "field1", name="blah").text = "some value1"
ET.SubElement(doc, "field2", name="asdfasd").text = "some value2"

tree = ET.ElementTree(root)
tree.write("filename.xml")
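To inspect the result without writing a file, ET.tostring can serialize an element to a string. This is a small sketch along the same lines as the example above, with a hypothetical attribute value; it also parses the string back to check that the content survives the round trip.

```python
import xml.etree.ElementTree as ET

# Build a small tree; the tag and attribute values are placeholders.
root = ET.Element("root")
doc = ET.SubElement(root, "doc")
ET.SubElement(doc, "field1", name="first").text = "some value1"

# encoding="unicode" returns a str rather than bytes.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Parse the serialized text back and look up the nested element by path.
parsed = ET.fromstring(xml_text)
field = parsed.find("doc/field1")
print(field.get("name"), field.text)
```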

Conclusion

In this tutorial, we've learned how to read, write, and manipulate data in CSV, JSON, and XML formats using Python. These are fundamental skills you'll need when handling data in Python. Practice these techniques with different datasets to get a better grasp of them. Remember, the key to mastering Python (or any programming language) is consistent practice and experimentation. Happy coding!