March 14, 2023
I often pull data from BigQuery into a Vertex AI instance and shape it into a Pandas dataframe to work with.
Minimum viable snippet to get that to work:
    import pandas as pd
    from google.cloud import bigquery

    client = bigquery.Client()
    query_string = "<some SQL query>"
    df = client.query(query_string).to_dataframe()
March 13, 2023
I often find it useful to stash .csv files in a Google Cloud Storage (GCS) bucket so they can be accessed from a Vertex AI JupyterLab notebook.
Once the files are accessible from Vertex, I can do all sorts of things with them.
One of the most common operations I perform in this scenario is to load the .csv data into a Pandas dataframe, which can in turn be used to enrich other data that I have stored in BigQuery.
...
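The excerpt above is cut off, so the following is only a minimal sketch of one way to load a .csv from GCS into a dataframe, not necessarily the approach the full post takes. It assumes that pandas and gcsfs are installed and that the notebook's credentials can read the bucket. The helper names gcs_uri and load_csv_from_gcs, and any bucket or file names passed to them, are hypothetical.

```python
def gcs_uri(bucket, blob_path):
    """Build a gs:// URI from a bucket name and an object path."""
    return f"gs://{bucket}/{blob_path.lstrip('/')}"


def load_csv_from_gcs(bucket, blob_path):
    """Read a .csv object from GCS into a Pandas dataframe.

    Assumption: pandas and gcsfs are installed, and the environment's
    default credentials are allowed to read the bucket.
    """
    # Imported lazily so the URI helper can be used without pandas.
    import pandas as pd

    # pandas hands gs:// paths to gcsfs when it is available.
    return pd.read_csv(gcs_uri(bucket, blob_path))
```

In a notebook this would be called as df = load_csv_from_gcs("<bucket name>", "<path to file>.csv"), with both arguments replaced by real values.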
February 25, 2023
If you’re a data scientist, there’s a good chance that you spend a healthy chunk of time working in Jupyter Notebooks.
Every now and then, you might do something that triggers a warning (such as ::ahem:: using a deprecated method from the Pandas package).
That warning can end up consuming a whole lot of screen real estate, especially if it’s triggered inside a loop.
One way to deal with those warnings is to simply make them disappear by adding the following chunk in a cell near the top of the notebook:
...
September 27, 2022
Generic setup for supervised modeling in Python.
September 6, 2022
Running a random forest classifier in Python
September 2, 2022
Python model assessment for classifiers
September 2, 2022
Python test train split
August 17, 2022
That super common if __name__ == '__main__' conditional in Python scripts
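As a rough illustration of that conditional, here is a minimal sketch; the main function and its return value are hypothetical.

```python
def main():
    """Hypothetical entry point; returns a message instead of doing real work."""
    return "running as a script"


# __name__ is "__main__" only when the file is executed directly
# (for example `python my_script.py`), not when it is imported as a module.
if __name__ == "__main__":
    print(main())
```

Importing this file from another module defines main without printing anything, which is the point of the guard.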
August 14, 2022
Built-in Python function that handles creating a counter for loop iterations.
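The title does not name the function; presumably it refers to enumerate, which pairs each item with a running counter. A small sketch under that assumption, with a made-up list:

```python
fruits = ["apple", "banana", "cherry"]  # made-up sample data

labeled = []
# enumerate yields (counter, item) pairs; start=1 makes the count begin at 1
# instead of the default 0.
for index, fruit in enumerate(fruits, start=1):
    labeled.append(f"{index}: {fruit}")
```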
August 11, 2022
Interact with .csv files in Python
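The post itself is not shown here, so this is only a sketch of one way to write and read a .csv with the standard library's csv module. The file name, column names, and rows are all made up for illustration.

```python
import csv
import os
import tempfile

# Made-up sample rows; csv reads every value back as a string.
rows = [
    {"name": "Ada", "score": "90"},
    {"name": "Grace", "score": "85"},
]

path = os.path.join(tempfile.mkdtemp(), "scores.csv")

# newline="" lets the csv module control line endings itself.
with open(path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "score"])
    writer.writeheader()
    writer.writerows(rows)

# DictReader uses the header row as the keys for each record.
with open(path, newline="") as f:
    loaded = list(csv.DictReader(f))
```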