whatstk.graph package

Subpackages

Submodules

whatstk.graph.base module

Build plotly-compatible figures.

Classes:

FigureBuilder([df, chat])

Generate a variety of figures from your loaded chat.

class whatstk.graph.base.FigureBuilder(df: DataFrame | None = None, chat: BaseChat | None = None)

Bases: object

Generate a variety of figures from your loaded chat.

Integrates feature extraction and visualization logic to automate data plots.

Note: Either df or chat must be provided.

Parameters:
  • df (pandas.DataFrame, optional) – Chat data. The df attribute of a chat loaded using Chat. If a value is given, chat is ignored.

  • chat (Chat, optional) – Chat data. Object obtained when a chat is loaded using Chat. Required if df is None.

Attributes:

user_color_mapping

Get mapping between user and color.

usernames

Get list with users available in given chat.

Methods:

user_interventions_count_linechart([...])

Plot number of user interventions over time.

user_message_responses_flow([title])

Get the flow of message responses.

user_message_responses_heatmap([norm, title])

Get the response matrix heatmap.

user_msg_length_boxplot([title, xlabel])

Generate figure with boxplots of each user's message length.

property user_color_mapping: Dict[str, str]

Get mapping between user and color.

Each user is assigned a color automatically, so that this color is preserved for that user in all to-be-generated plots.

Returns:

dict – Mapping from username to color (rgb).
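As a rough illustration of the idea (not whatstk's actual implementation), a stable user-to-color mapping can be built by sorting the distinct usernames and cycling through a fixed palette. The `PALETTE` values and the `build_color_mapping` helper below are hypothetical names introduced only for this sketch.

```python
from itertools import cycle
from typing import Dict, Iterable, List

# Hypothetical palette; the colors whatstk actually uses may differ.
PALETTE: List[str] = [
    "rgb(31, 119, 180)",
    "rgb(255, 127, 14)",
    "rgb(44, 160, 44)",
]


def build_color_mapping(usernames: Iterable[str]) -> Dict[str, str]:
    """Assign one color per distinct user, independent of input order."""
    # Sorting the distinct names makes the assignment deterministic.
    ordered = sorted(set(usernames))
    # cycle() reuses palette entries when there are more users than colors.
    return dict(zip(ordered, cycle(PALETTE)))


mapping = build_color_mapping(["Mary", "John", "Mary", "Giuseppe"])
```

Because the mapping depends only on the set of usernames, every plot built from the same chat would reuse the same color for a given user.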

user_interventions_count_linechart(date_mode: str = 'date', msg_length: bool = False, cumulative: bool = False, all_users: bool = False, title: str = 'User interventions count', xlabel: str = 'Date/Time') -> Figure

Plot number of user interventions over time.

Parameters:
  • date_mode (str, optional) –

    Choose mode to group interventions by. Defaults to 'date'. Available modes are:

    • 'date': Grouped by particular date (year, month and day).

    • 'hour': Grouped by hours.

    • 'month': Grouped by months.

    • 'weekday': Grouped by weekday (i.e., Monday, Tuesday, …, Sunday).

    • 'hourweekday': Grouped by weekday and hour.

  • msg_length (bool, optional) – Set to True to count the number of characters instead of number of messages sent.

  • cumulative (bool, optional) – Set to True to obtain cumulative counts.

  • all_users (bool, optional) – Obtain number of interventions of all users combined. Defaults to False.

  • title (str, optional) – Title for plot. Defaults to “User interventions count”.

  • xlabel (str, optional) – x-axis label title. Defaults to “Date/Time”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.
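To make the grouping options concrete, here is a minimal standard-library sketch of how date_mode, msg_length and cumulative could interact. It is not whatstk's code: the `group_key` and `count_interventions` helpers are hypothetical, the sketch counts all users together (as with all_users=True), and it assumes 'month' means month of the year and 'weekday' is numbered from 0 (Monday).

```python
from collections import Counter
from datetime import datetime
from itertools import accumulate
from typing import Dict, Hashable, Iterable, Tuple


def group_key(ts: datetime, date_mode: str) -> Hashable:
    """Map a timestamp to its bucket for the given date_mode (assumed semantics)."""
    if date_mode == "date":
        return ts.date()
    if date_mode == "hour":
        return ts.hour
    if date_mode == "month":
        return ts.month
    if date_mode == "weekday":
        return ts.weekday()  # 0 = Monday, ..., 6 = Sunday
    if date_mode == "hourweekday":
        return (ts.weekday(), ts.hour)
    raise ValueError(f"Unknown date_mode: {date_mode!r}")


def count_interventions(
    messages: Iterable[Tuple[datetime, str]],
    date_mode: str = "date",
    msg_length: bool = False,
    cumulative: bool = False,
) -> Dict[Hashable, int]:
    """Count messages (or characters) per bucket, optionally as a running total."""
    counts: Counter = Counter()
    for ts, text in messages:
        counts[group_key(ts, date_mode)] += len(text) if msg_length else 1
    keys = sorted(counts)
    values = [counts[k] for k in keys]
    if cumulative:
        values = list(accumulate(values))
    return dict(zip(keys, values))


messages = [
    (datetime(2020, 1, 1, 9, 0), "hi"),
    (datetime(2020, 1, 1, 21, 30), "hello there"),
    (datetime(2020, 1, 2, 9, 15), "ok"),
]
per_day = count_interventions(messages)
chars_running = count_interventions(messages, msg_length=True, cumulative=True)
per_hour = count_interventions(messages, date_mode="hour")
```

Here `per_day` counts messages per calendar date, `chars_running` is a running total of characters over the same dates, and `per_hour` groups by hour of day regardless of date.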

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_interventions_count_linechart(cumulative=True)
>>> plot(fig)

user_message_responses_flow(title: str = 'Message flow') -> Figure

Get the flow of message responses.

A response from user X to user Y happens if user X sends a message right after a message from user Y.

Uses a Sankey diagram.
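The response definition above can be sketched with the standard library as follows. This is only an illustration under stated assumptions: `count_responses` is a hypothetical helper, and the sketch also counts a user who follows their own message, which may or may not match what whatstk does.

```python
from collections import Counter
from typing import Dict, List, Tuple


def count_responses(senders: List[str]) -> Dict[Tuple[str, str], int]:
    """Count (responder, previous sender) pairs over consecutive messages."""
    pairs: Counter = Counter()
    # Pair every message with the one immediately before it.
    for previous, current in zip(senders, senders[1:]):
        pairs[(current, previous)] += 1
    return dict(pairs)


# Message authors in chronological order.
senders = ["Mary", "John", "Mary", "Mary", "John"]
flows = count_responses(senders)
```

Each key of `flows` corresponds to one link of a Sankey-style diagram (from responder to the user being answered), and its value to the width of that link.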

Parameters:

title (str, optional) – Title for plot. Defaults to “Message flow”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_message_responses_flow()
>>> plot(fig)

user_message_responses_heatmap(norm: str = 'absolute', title: str = 'Response matrix') -> Figure

Get the response matrix heatmap.

A response from user X to user Y happens if user X sends a message right after a message from user Y.

Parameters:
  • norm (str, optional) –

    Specifies the type of normalization used for response count. Can be:

    • 'absolute': Absolute count of messages.

    • 'joint': Normalized by total number of messages sent by all users.

    • 'sender': Normalized per sender by total number of messages sent by user.

    • 'receiver': Normalized per receiver by total number of messages sent by user.

  • title (str, optional) – Title for plot. Defaults to “Response matrix”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.
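One plausible reading of the four norm options, sketched on a small response matrix stored as nested dictionaries (rows are responders, columns are the users being answered). The `normalize` helper is hypothetical, and the interpretation of 'sender' as row-wise and 'receiver' as column-wise normalization is an assumption rather than a statement about whatstk's internals.

```python
from typing import Dict

Matrix = Dict[str, Dict[str, float]]


def _safe_div(value: float, total: float) -> float:
    """Divide, returning 0.0 when the total is zero."""
    return value / total if total else 0.0


def normalize(matrix: Matrix, norm: str = "absolute") -> Matrix:
    """Normalize a square response matrix according to `norm` (assumed semantics)."""
    if norm == "absolute":
        return {s: dict(row) for s, row in matrix.items()}
    if norm == "joint":
        # Divide every cell by the grand total.
        total = sum(sum(row.values()) for row in matrix.values())
        return {s: {r: _safe_div(v, total) for r, v in row.items()}
                for s, row in matrix.items()}
    if norm == "sender":
        # Divide every row by its own sum.
        return {s: {r: _safe_div(v, sum(row.values())) for r, v in row.items()}
                for s, row in matrix.items()}
    if norm == "receiver":
        # Divide every column by its own sum.
        col_totals = {r: sum(row.get(r, 0) for row in matrix.values())
                      for r in matrix}
        return {s: {r: _safe_div(v, col_totals[r]) for r, v in row.items()}
                for s, row in matrix.items()}
    raise ValueError(f"Unknown norm: {norm!r}")


counts: Matrix = {
    "Mary": {"Mary": 1, "John": 1},
    "John": {"Mary": 2, "John": 0},
}
joint = normalize(counts, "joint")
by_sender = normalize(counts, "sender")
by_receiver = normalize(counts, "receiver")
```

With 'joint' all cells sum to 1, with 'sender' each row sums to 1, and with 'receiver' each column sums to 1 (rows or columns that are entirely zero stay at zero).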

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_message_responses_heatmap()
>>> plot(fig)

user_msg_length_boxplot(title: str = 'User message length', xlabel: str = 'User') -> Figure

Generate figure with boxplots of each user’s message length.

Parameters:
  • title (str, optional) – Title for plot. Defaults to “User message length”.

  • xlabel (str, optional) – x-axis label title. Defaults to “User”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.
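The data behind such a boxplot is simply the list of message lengths per user. A minimal sketch, assuming messages are available as (user, text) pairs; the helper names are hypothetical and the summary is limited to minimum, median and maximum to keep the example short.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def message_lengths_by_user(
    messages: Iterable[Tuple[str, str]]
) -> Dict[str, List[int]]:
    """Collect the character length of every message, grouped by user."""
    lengths: Dict[str, List[int]] = defaultdict(list)
    for user, text in messages:
        lengths[user].append(len(text))
    return dict(lengths)


def summarize(lengths: Dict[str, List[int]]) -> Dict[str, Tuple[int, float, int]]:
    """Return (min, median, max) of message length for each user."""
    return {user: (min(v), median(v), max(v)) for user, v in lengths.items()}


messages = [
    ("Mary", "hi"),
    ("John", "hello there"),
    ("Mary", "ok, see you"),
    ("Mary", "bye"),
]
lengths = message_lengths_by_user(messages)
summary = summarize(lengths)
```

A boxplot would draw one box per key of `lengths`, with the quartiles computed from the raw values by the plotting library.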

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_msg_length_boxplot()
>>> plot(fig)

property usernames: BaseChat

Get list with users available in given chat.

Returns:

list – List with usernames available in chat DataFrame.

Module contents

Plot tools using plotly.

Import plot to plot figures.

Classes:

FigureBuilder([df, chat])

Generate a variety of figures from your loaded chat.

Functions:

plot(figure_or_data[, show_link, link_text, ...])

Create a plotly graph locally as an HTML document or string.

class whatstk.graph.FigureBuilder(df: DataFrame | None = None, chat: BaseChat | None = None)

Bases: object

Generate a variety of figures from your loaded chat.

Integrates feature extraction and visualization logic to automate data plots.

Note: Either df or chat must be provided.

Parameters:
  • df (pandas.DataFrame, optional) – Chat data. The df attribute of a chat loaded using Chat. If a value is given, chat is ignored.

  • chat (Chat, optional) – Chat data. Object obtained when a chat is loaded using Chat. Required if df is None.

Attributes:

user_color_mapping

Get mapping between user and color.

usernames

Get list with users available in given chat.

Methods:

user_interventions_count_linechart([...])

Plot number of user interventions over time.

user_message_responses_flow([title])

Get the flow of message responses.

user_message_responses_heatmap([norm, title])

Get the response matrix heatmap.

user_msg_length_boxplot([title, xlabel])

Generate figure with boxplots of each user's message length.

property user_color_mapping: Dict[str, str]

Get mapping between user and color.

Each user is assigned a color automatically, so that this color is preserved for that user in all to-be-generated plots.

Returns:

dict – Mapping from username to color (rgb).

user_interventions_count_linechart(date_mode: str = 'date', msg_length: bool = False, cumulative: bool = False, all_users: bool = False, title: str = 'User interventions count', xlabel: str = 'Date/Time') -> Figure

Plot number of user interventions over time.

Parameters:
  • date_mode (str, optional) –

    Choose mode to group interventions by. Defaults to 'date'. Available modes are:

    • 'date': Grouped by particular date (year, month and day).

    • 'hour': Grouped by hours.

    • 'month': Grouped by months.

    • 'weekday': Grouped by weekday (i.e., Monday, Tuesday, …, Sunday).

    • 'hourweekday': Grouped by weekday and hour.

  • msg_length (bool, optional) – Set to True to count the number of characters instead of number of messages sent.

  • cumulative (bool, optional) – Set to True to obtain cumulative counts.

  • all_users (bool, optional) – Obtain number of interventions of all users combined. Defaults to False.

  • title (str, optional) – Title for plot. Defaults to “User interventions count”.

  • xlabel (str, optional) – x-axis label title. Defaults to “Date/Time”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_interventions_count_linechart(cumulative=True)
>>> plot(fig)

user_message_responses_flow(title: str = 'Message flow') -> Figure

Get the flow of message responses.

A response from user X to user Y happens if user X sends a message right after a message from user Y.

Uses a Sankey diagram.

Parameters:

title (str, optional) – Title for plot. Defaults to “Message flow”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_message_responses_flow()
>>> plot(fig)

user_message_responses_heatmap(norm: str = 'absolute', title: str = 'Response matrix') -> Figure

Get the response matrix heatmap.

A response from user X to user Y happens if user X sends a message right after a message from user Y.

Parameters:
  • norm (str, optional) –

    Specifies the type of normalization used for response count. Can be:

    • 'absolute': Absolute count of messages.

    • 'joint': Normalized by total number of messages sent by all users.

    • 'sender': Normalized per sender by total number of messages sent by user.

    • 'receiver': Normalized per receiver by total number of messages sent by user.

  • title (str, optional) – Title for plot. Defaults to “Response matrix”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_message_responses_heatmap()
>>> plot(fig)

user_msg_length_boxplot(title: str = 'User message length', xlabel: str = 'User') -> Figure

Generate figure with boxplots of each user’s message length.

Parameters:
  • title (str, optional) – Title for plot. Defaults to “User message length”.

  • xlabel (str, optional) – x-axis label title. Defaults to “User”.

Returns:

plotly.graph_objs.Figure – Plotly Figure.

Example

>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_msg_length_boxplot()
>>> plot(fig)

property usernames: BaseChat

Get list with users available in given chat.

Returns:

list – List with usernames available in chat DataFrame.

whatstk.graph.plot(figure_or_data, show_link=False, link_text='Export to plot.ly', validate=True, output_type='file', include_plotlyjs=True, filename='temp-plot.html', auto_open=True, image=None, image_filename='plot_image', image_width=800, image_height=600, config=None, include_mathjax=False, auto_play=True, animation_opts=None)

Create a plotly graph locally as an HTML document or string.

Example:

```
from plotly.offline import plot
import plotly.graph_objs as go

plot([go.Scatter(x=[1, 2, 3], y=[3, 2, 6])], filename='my-graph.html')
# We can also download an image of the plot by setting the image parameter
# to the image format we want
plot([go.Scatter(x=[1, 2, 3], y=[3, 2, 6])], filename='my-graph.html',
     image='jpeg')
```

More examples below.

figure_or_data – a plotly.graph_objs.Figure or plotly.graph_objs.Data or dict or list that describes a Plotly graph. See https://plot.ly/python/ for examples of graph descriptions.

Keyword arguments:

show_link (default=False) – display a link in the bottom-right corner of the chart that will export the chart to Plotly Cloud or Plotly Enterprise.

link_text (default=’Export to plot.ly’) – the text of the export link.

validate (default=True) – whether to validate that all of the keys in the figure are valid. Set to False if your version of plotly.js is out of date relative to your version of graph_reference.json, or if you need to include extra, unnecessary keys in your figure.

output_type (‘file’ | ‘div’ - default ‘file’) – if ‘file’, then the graph is saved as a standalone HTML file and plot returns None. If ‘div’, then plot returns a string that contains only the HTML <div> holding the graph and the script that generates it. Use ‘file’ if you want to save and view a single graph at a time in a standalone HTML file. Use ‘div’ if you are embedding these graphs in an HTML file with other graphs or HTML markup, such as an HTML report or a website.

include_plotlyjs (True | False | ‘cdn’ | ‘directory’ | path - default=True)

Specifies how the plotly.js library is included in the output html file or div string.

If True, a script tag containing the plotly.js source code (~3MB) is included in the output. HTML files generated with this option are fully self-contained and can be used offline.

If ‘cdn’, a script tag that references the plotly.js CDN is included in the output. HTML files generated with this option are about 3MB smaller than those generated with include_plotlyjs=True, but they require an active internet connection in order to load the plotly.js library.

If ‘directory’, a script tag is included that references an external plotly.min.js bundle that is assumed to reside in the same directory as the HTML file. If output_type=’file’ then the plotly.min.js bundle is copied into the directory of the resulting HTML file. If a file named plotly.min.js already exists in the output directory then this file is left unmodified and no copy is performed. HTML files generated with this option can be used offline, but they require a copy of the plotly.min.js bundle in the same directory. This option is useful when many figures will be saved as HTML files in the same directory because the plotly.js source code will be included only once per output directory, rather than once per output file.

If a string that ends in ‘.js’, a script tag is included that references the specified path. This approach can be used to point the resulting HTML file to an alternative CDN.

If False, no script tag referencing plotly.js is included. This is useful when output_type=’div’ and the resulting div string will be placed inside an HTML document that already loads plotly.js. This option is not advised when output_type=’file’ as it will result in a non-functional html file.

filename (default=’temp-plot.html’) – The local filename to save the outputted chart to. If the filename already exists, it will be overwritten. This argument only applies if output_type is ‘file’.

auto_open (default=True) – If True, open the saved file in a web browser after saving. This argument only applies if output_type is ‘file’.

image (default=None |’png’ |’jpeg’ |’svg’ |’webp’) – This parameter sets the format of the image to be downloaded, if we choose to download an image. This parameter has a default value of None indicating that no image should be downloaded. Please note: for higher resolution images and more export options, consider making requests to our image servers. Type: help(py.image) for more details.

image_filename (default=’plot_image’) – Sets the name of the file your image will be saved to. The extension should not be included.

image_height (default=600) – Specifies the height of the image in px.

image_width (default=800) – Specifies the width of the image in px.

config (default=None) – Plot view options dictionary. Keyword arguments show_link and link_text set the associated options in this dictionary if it doesn’t contain them already.

include_mathjax (False | ‘cdn’ | path - default=False) –

Specifies how the MathJax.js library is included in the output html file or div string. MathJax is required in order to display labels with LaTeX typesetting.

If False, no script tag referencing MathJax.js will be included in the output. HTML files generated with this option will not be able to display LaTeX typesetting.

If ‘cdn’, a script tag that references a MathJax CDN location will be included in the output. HTML files generated with this option will be able to display LaTeX typesetting as long as they have internet access.

If a string that ends in ‘.js’, a script tag is included that references the specified path. This approach can be used to point the resulting HTML file to an alternative CDN.

auto_play (default=True) – Whether to automatically start the animation sequence on page load if the figure contains frames. Has no effect if the figure does not contain frames.

animation_opts (default=None) – Dict of custom animation parameters that are used for the automatically started animation on page load. This dict is passed to the function Plotly.animate in Plotly.js. See https://github.com/plotly/plotly.js/blob/master/src/plots/animation_attributes.js for available options. Has no effect if the figure does not contain frames, or auto_play is False.

Example:

```
from plotly.offline import plot

figure = {'data': [{'x': [0, 1], 'y': [0, 1]}],
          'layout': {'xaxis': {'range': [0, 5], 'autorange': False},
                     'yaxis': {'range': [0, 5], 'autorange': False},
                     'title': 'Start Title'},
          'frames': [{'data': [{'x': [1, 2], 'y': [1, 2]}]},
                     {'data': [{'x': [1, 4], 'y': [1, 4]}]},
                     {'data': [{'x': [3, 4], 'y': [3, 4]}],
                      'layout': {'title': 'End Title'}}]}

plot(figure, animation_opts={'frame': {'duration': 1}})
```