How to Use Python to Load the Content of a Google Document

A short walkthrough on how to use the Google Docs API in Python.

You may have a Google Doc saved in Google Drive that you’d like to load dynamically into a pandas DataFrame for further processing.

To begin, log in to the Google Cloud Platform and create a new project called MyProject. Then, in the main dashboard, select the newly created project and choose Explore and enable APIs. Search for the Google Docs API and click Enable. Once it is enabled, you’ll need to create new credentials.

In the credentials wizard, select Google Docs API as the API you are using and User data as the data you will be accessing.

Complete the credential creation by following the wizard’s instructions. In section 4, OAuth Client ID, choose Desktop app as the Application type.

You can now download the generated client secret to your computer and rename it credentials.json. Place this file in the same directory as your Python code.
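Before moving on, it can help to confirm that the downloaded file is the right kind of secret. The following is a minimal sketch, assuming the usual layout of a Desktop app client secret, in which the settings sit under a top-level "installed" key. The function name check_client_secret is hypothetical and not part of any Google library.

```python
import json
from pathlib import Path


def check_client_secret(path="credentials.json"):
    """Return the client ID if the file looks like a Desktop app client secret."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: Desktop app secrets are nested under "installed",
    # while Web application secrets use "web" instead.
    if "installed" not in data:
        raise ValueError(f"{path} does not look like a Desktop app client secret")
    return data["installed"]["client_id"]
```

Calling check_client_secret() from the directory that holds your script should return the client ID, or raise an error if a different credential type was downloaded by mistake.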

You can now copy the sample code from the official Google Docs API documentation, which downloads the content of a Google Doc. Make sure you have permission to read the Google Doc!
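For orientation, the core of that sample can be sketched roughly as follows. This is a condensed approximation rather than the official code: it assumes the google-api-python-client and google-auth-oauthlib packages are installed, it skips the token caching that the official quickstart performs, and the helper names extract_document_id and fetch_document are hypothetical.

```python
import re

# Read-only access is enough for loading a document's content.
SCOPES = ["https://www.googleapis.com/auth/documents.readonly"]


def extract_document_id(url):
    """Pull the document ID out of a URL like .../document/d/<ID>/edit."""
    match = re.search(r"/document/d/([a-zA-Z0-9_-]+)", url)
    if match is None:
        raise ValueError(f"No document ID found in {url!r}")
    return match.group(1)


def fetch_document(document_id, secret_path="credentials.json"):
    """Run the OAuth flow in a browser and return the document as a dict."""
    # Imported here so the module loads even without the Google packages.
    from google_auth_oauthlib.flow import InstalledAppFlow
    from googleapiclient.discovery import build

    flow = InstalledAppFlow.from_client_secrets_file(secret_path, SCOPES)
    creds = flow.run_local_server(port=0)  # opens a consent page in the browser
    service = build("docs", "v1", credentials=creds)
    return service.documents().get(documentId=document_id).execute()
```

With a real document URL, body = fetch_document(extract_document_id(url)).get("body") would give the body object whose content is read in the next step.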

After extracting the text, you can save it to a local file and then convert it to a DataFrame for further processing. Each line of text will be stored as a new row in the DataFrame. Assume you’ve saved the text from the Google Doc into a variable called text:

doc_content = body.get('content')
text = read_structural_elements(doc_content)
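The text-extraction helper comes from the official sample. The sketch below is a condensed version of the same idea, assuming the Docs API response layout in which paragraphs hold textRun elements and tables nest further content inside their cells. The sample_content value is made up purely for illustration.

```python
def read_paragraph_element(element):
    """Return the text of one paragraph element, or '' if it has no text run."""
    text_run = element.get("textRun")
    if not text_run:
        return ""
    return text_run.get("content", "")


def read_structural_elements(elements):
    """Recursively collect the text from a list of structural elements."""
    text = ""
    for value in elements:
        if "paragraph" in value:
            for elem in value["paragraph"].get("elements", []):
                text += read_paragraph_element(elem)
        elif "table" in value:
            # Text inside table cells is itself a list of structural elements.
            for row in value["table"].get("tableRows", []):
                for cell in row.get("tableCells", []):
                    text += read_structural_elements(cell.get("content", []))
        elif "tableOfContents" in value:
            toc = value["tableOfContents"]
            text += read_structural_elements(toc.get("content", []))
    return text


# Made-up content: a section break, one paragraph, and a one-cell table.
sample_content = [
    {"sectionBreak": {}},
    {"paragraph": {"elements": [{"textRun": {"content": "First line\n"}}]}},
    {"table": {"tableRows": [{"tableCells": [{"content": [
        {"paragraph": {"elements": [{"textRun": {"content": "cell\n"}}]}}
    ]}]}]}},
]
sample_text = read_structural_elements(sample_content)
```

Elements without text, such as the section break, simply contribute nothing to the result.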

You can now save it as a local file:

with open("my_doc.txt", "w") as text_file:
    text_file.write(text)

The file can then be reloaded and converted to a DataFrame:

import pandas as pd

with open('my_doc.txt', 'r') as f:
    # One list entry per line, without the trailing newline
    text = [line.rstrip('\n') for line in f]
df = pd.DataFrame(text, columns=['text'])
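If the intermediate file is not needed, the text can also be split in memory. The helper below is a small sketch (text_to_rows is a hypothetical name) that additionally drops blank lines, which is an assumption about what you want rather than something the steps above require.

```python
def text_to_rows(text):
    """Split text into one dict per non-empty line."""
    # Assumption: blank lines carry no information and can be skipped.
    return [{"text": line} for line in text.splitlines() if line.strip()]


rows = text_to_rows("Title\n\nFirst paragraph\nSecond paragraph\n")
```

Passing the result to pd.DataFrame(rows) then gives a DataFrame with a single text column, one row per non-empty line.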

You’re finally ready to work with your data! Have fun doing it!

Summary
In this tutorial, I’ve shown you how to load text from a Google Doc and convert it to a pandas DataFrame using the official Google Docs API.

The method is straightforward, but it does require a few key steps, such as configuring the credentials used to access the Google Docs API.
