3D scatter plots

Introduction

In this lesson, we will explore 3D scatter plots using Plotly, an interactive visualization library. A 3D scatter plot shows three numerical variables in a single chart, which makes it easier to spot clusters, trends, and outliers that involve all three variables at once. We will learn how to create and customize 3D scatter plots using Plotly and Pandas, and how to interpret these plots.

Loading data and basic setup

In this code example, we will create a 3D scatter plot using the iris dataset that ships with Plotly. We will first load the data into a Pandas DataFrame and inspect its first few rows.

import plotly.express as px
import pandas as pd

# Load built-in dataset
df = px.data.iris()

# Display the first few rows of the DataFrame
print(df.head())

After loading the data and setting up the DataFrame, we will proceed to construct the 3D scatter plot using Plotly.

# Create a 3D scatter plot
fig = px.scatter_3d(df, x='sepal_width', y='sepal_length', z='petal_length', 
                    color='species', symbol='species', 
                    labels={'species': 'Iris species'})

# Show the plot
fig.show()

Creating a simple 3D scatter plot

In this code example, we will create the simplest useful 3D scatter plot: three iris measurements on the axes, with the points colored by species.

Code Block 1: Importing libraries and preparing the dataset

First, let's import the necessary libraries and load the built-in iris dataset from Plotly. We'll also display the first few rows of the dataset using df.head().

import plotly.express as px
import pandas as pd

# Load the iris dataset
df = px.data.iris()

# Display the first few rows
df.head()

Code Block 2: Creating the 3D scatter plot

Now, let's create the 3D scatter plot using the iris dataset. We will plot the sepal width, sepal length, and petal length on the x, y, and z axes respectively, and color the points by species.

# Create the 3D scatter plot
fig = px.scatter_3d(df, x='sepal_width', y='sepal_length', z='petal_length', color='species')

# Display the plot
fig.show()

Customizing the appearance of the 3D scatter plot

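In this code example, we will customize the appearance of a 3D scatter plot of the iris dataset. The size and size_max arguments scale each marker by petal width, opacity makes overlapping points easier to see, and template applies a dark theme. We will divide the code into two blocks, one for creating the Pandas dataframe and another for constructing the Plotly chart.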

Code Block 1: Creating the Pandas Dataframe

import pandas as pd
import plotly.express as px

# Load built-in dataset
df = px.data.iris()

# Display the first 5 rows of the dataframe
print(df.head())

Code Block 2: Constructing the Plotly Chart

# Create a 3D scatter plot with custom appearance
fig = px.scatter_3d(df, x='sepal_width', y='sepal_length', z='petal_length',
                    color='species', symbol='species',
                    size='petal_width', size_max=15,
                    opacity=0.7, template='plotly_dark',
                    title='Iris Dataset: Sepal Width vs Sepal Length vs Petal Length')

# Add a thin, semi-transparent outline to each marker
fig.update_traces(marker=dict(line=dict(width=1, color='rgba(0, 0, 0, 0.2)')))

# Show the plot
fig.show()

Adding hover information to the 3D scatter plot

In this code example, we will add hover information to a 3D scatter plot using Plotly's built-in Gapminder dataset. The hover_name argument selects the column shown in bold at the top of each tooltip, here the country name.

First, we will import the necessary libraries and load the data:

import plotly.express as px
import pandas as pd

# Load built-in dataset from Plotly
df = px.data.gapminder()

# Filter the dataset for the year 2007
df = df[df['year']==2007]

# Display the first few rows of the dataset
print(df.head())

Now, let's create the 3D scatter plot with hover information:

# Create the 3D scatter plot
fig = px.scatter_3d(df, x='gdpPercap', y='lifeExp', z='pop',
                    color='continent', size='pop', size_max=60,
                    hover_name='country', log_x=True, log_z=True)

# Set the axis labels
fig.update_layout(scene=dict(xaxis_title='GDP per Capita',
                              yaxis_title='Life Expectancy',
                              zaxis_title='Population'))

# Show the plot
fig.show()

Animating the 3D scatter plot over time

In this code example, we will animate a 3D scatter plot over time using Plotly's built-in Gapminder dataset, loaded as a Pandas DataFrame.

Code Block 1: Preparing the data

First, we will import the required libraries and load the built-in Gapminder dataset. After that, we will prepare the data by selecting the required columns and displaying the first few rows of the resulting dataframe.

import plotly.express as px
import pandas as pd

# Load the Gapminder dataset
gapminder_dataset = px.data.gapminder()

# Select the required columns
columns = ['year', 'continent', 'country', 'pop', 'gdpPercap', 'lifeExp']
df = gapminder_dataset[columns]

# Display the first few rows of the dataframe
df.head()

Code Block 2: Creating the animated 3D scatter plot

Now, we will create the animated 3D scatter plot using the prepared data. We will set the x-axis to 'gdpPercap', the y-axis to 'lifeExp', and the z-axis to 'pop'. The animation will be based on the 'year' column.

# Create the animated 3D scatter plot
fig = px.scatter_3d(df, x='gdpPercap', y='lifeExp', z='pop',
                    color='continent', symbol='continent', 
                    animation_frame='year', size='pop',
                    text='country', log_x=True, log_z=True,
                    range_x=[100, 100000], range_z=[100000, 1000000000],
                    labels={'gdpPercap': 'GDP per Capita',
                            'lifeExp': 'Life Expectancy',
                            'pop': 'Population'})

# Show the plot
fig.show()

By executing these code blocks, you will be able to create an animated 3D scatter plot that visualizes the relationship between GDP per capita, life expectancy, and population over time.

Using color scales to represent a fourth dimension

In this code example, we will be using color scales to represent a fourth dimension in a 3D scatter plot. We will use the built-in dataset iris from the plotly library.

First, let's create our dataframe and display its first few rows.

import plotly.express as px

# Load iris dataset
df = px.data.iris()

# Display first few rows of the dataset
print(df.head())

Now, let's create a 3D scatter plot using the plotly.graph_objects module. We will use the columns sepal_width, sepal_length, and petal_length as our x, y, and z axes, respectively. Additionally, we will map the species_id column, an integer code for each species, to a color scale so that color carries the fourth dimension.

import plotly.graph_objects as go

# Create a 3D scatter plot
fig = go.Figure(data=[go.Scatter3d(
    x=df['sepal_width'],
    y=df['sepal_length'],
    z=df['petal_length'],
    mode='markers',
    marker=dict(
        size=6,
        color=df['species_id'], # set color scale to species_id
        colorscale='Viridis',   # choose a colorscale
        opacity=0.8
    )
)])

# Set axis labels
fig.update_layout(scene=dict(
    xaxis_title='Sepal Width',
    yaxis_title='Sepal Length',
    zaxis_title='Petal Length')
)

# Show the plot
fig.show()

Exercises

1. Creating a 3D Scatter Plot with Plotly

Instruction

In this exercise, you will create a 3D scatter plot using the iris dataset. You will use the sepal length, sepal width, and petal length as your three dimensions (x, y, and z axes), and color the points based on the species of the iris flowers. Additionally, you will customize the marker size, marker symbol, and axis labels.

Follow these steps:

  1. Import the necessary libraries and load the built-in iris dataset from Plotly.
  2. Create a basic 3D scatter plot using the iris dataset.
  3. Change the marker size and symbol by using the size and symbol parameters in the px.scatter_3d function.
  4. Customize the axis labels by passing a scene dictionary (xaxis_title, yaxis_title, zaxis_title) to the fig.update_layout function.
  5. Show the plot using the fig.show() function.

My Solution

# Your solution goes here

Hint

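Remember to use the px.scatter_3d function to create the 3D scatter plot, and set the axis labels with fig.update_layout(scene=dict(xaxis_title=..., yaxis_title=..., zaxis_title=...)). The update_xaxes and update_yaxes functions apply only to 2D axes, and there is no update_zaxes function. Use the size and symbol parameters to change the marker size and symbol.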

Solution

import plotly.express as px

# Load the iris dataset
data = px.data.iris()

# Create a 3D scatter plot with custom marker size and symbol
fig = px.scatter_3d(data, x='sepal_width', y='sepal_length', z='petal_length',
                    color='species', size='petal_width', symbol='species')

# Customize the title and axis labels
fig.update_layout(
    title='Iris: Sepal Width vs Sepal Length vs Petal Length',
    scene=dict(
        xaxis_title='Sepal Width',
        yaxis_title='Sepal Length',
        zaxis_title='Petal Length',
    ),
)

# Show the plot
fig.show()