whatstk package
Subpackages
Submodules
whatstk.data module
Load sample chats.
This module contains links to the sample chats currently available online. For more details, please refer to the source code.
Classes
Urls (POKEMON, LOREM, LOREM1, LOREM2, LOREM_2000) |
-
class whatstk.data.Urls(POKEMON, LOREM, LOREM1, LOREM2, LOREM_2000)
Bases: tuple
Attributes
LOREM – Alias for field number 1
LOREM1 – Alias for field number 2
LOREM2 – Alias for field number 3
LOREM_2000 – Alias for field number 4
POKEMON – Alias for field number 0
-
property LOREM
Alias for field number 1
-
property LOREM1
Alias for field number 2
-
property LOREM2
Alias for field number 3
-
property LOREM_2000
Alias for field number 4
-
property POKEMON
Alias for field number 0
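The "alias for field number" wording comes from Python's named tuples, where each field can be read by name or by position. The sketch below rebuilds a tuple with the documented field order to show this. The URL values and the name sample_urls are placeholders chosen for illustration; the real links are in the whatstk.data source code.

```python
from collections import namedtuple

# Field order follows the documented signature: POKEMON is field 0, LOREM_2000 is field 4.
Urls = namedtuple("Urls", ["POKEMON", "LOREM", "LOREM1", "LOREM2", "LOREM_2000"])

# Placeholder values (assumption): not the real links stored in whatstk.data.
sample_urls = Urls(
    POKEMON="https://example.com/pokemon.txt",
    LOREM="https://example.com/lorem.txt",
    LOREM1="https://example.com/lorem1.txt",
    LOREM2="https://example.com/lorem2.txt",
    LOREM_2000="https://example.com/lorem_2000.txt",
)

# Access by name and access by position return the same value.
print(sample_urls.LOREM == sample_urls[1])
```

Because the object is a tuple, it can also be unpacked or iterated like any other sequence.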
Module contents
Python wrapper and analysis tools for WhatsApp chats.
This library provides a wrapper that supports chats exported in multiple languages and from multiple operating systems. It also provides analytics tools.
Classes
WhatsAppChat (df) |
Load and process a WhatsApp chat file. |
FigureBuilder ([df, chat]) |
Generate a variety of figures from your loaded chat. |
-
class whatstk.WhatsAppChat(df)
Bases: whatstk._chat.BaseChat
Load and process a WhatsApp chat file.
Parameters: df (pandas.DataFrame) – Chat.
Methods
from_source (filepath, **kwargs) |
Create an instance from a chat text file. |
from_sources (filepaths[, auto_header, …]) |
Load a WhatsAppChat instance from multiple sources. |
to_txt (filepath[, hformat]) |
Export chat to a text file. |
Example
This simple example loads a chat using WhatsAppChat. Once loaded, we can access its attribute df, which contains the loaded chat as a DataFrame.
>>> from whatstk.whatsapp.objects import WhatsAppChat
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.POKEMON)
>>> chat.df.head(5)
                 date     username                                            message
0 2016-08-06 13:23:00  Ash Ketchum                                          Hey guys!
1 2016-08-06 13:25:00        Brock              Hey Ash, good to have a common group!
2 2016-08-06 13:30:00        Misty  Hey guys! Long time haven't heard anything fro...
3 2016-08-06 13:45:00  Ash Ketchum  Indeed. I think having a whatsapp group nowada...
4 2016-08-06 14:30:00        Misty                                          Definetly
-
classmethod from_source(filepath, **kwargs)
Create an instance from a chat text file.
Parameters: - filepath (str) – Path to the file. It can be a local file (e.g. ‘path/to/file.txt’) or a URL to a hosted file (e.g. ‘http://www.url.to/file.txt’).
- **kwargs – Refer to the docs of df_from_txt_whatsapp for details on additional arguments.
Returns: WhatsAppChat – Class instance with loaded and parsed chat.
-
classmethod from_sources(filepaths, auto_header=None, hformat=None, encoding='utf-8')
Load a WhatsAppChat instance from multiple sources.
Parameters: - filepaths (list) – List with filepaths.
- auto_header (bool, optional) – Detect header automatically (applies to all files). If None, attempts to perform automatic header detection for all files. If False, hformat is required.
- hformat (list, optional) – List with the header format to be used for each file. The list must be of length equal to len(filepaths). A valid header format might be ‘[%y-%m-%d %H:%M:%S] - %name:’.
- encoding (str) – Encoding to use when reading/writing (e.g. ‘utf-8’). See the list of Python standard encodings.
Returns: WhatsAppChat – Class instance with loaded and parsed chat.
Example
Load a chat using two text files. In this example, we use sample chats (available online, see the URLs in the source code of whatstk.data).
>>> from whatstk.whatsapp.objects import WhatsAppChat
>>> from whatstk.data import whatsapp_urls
>>> filepath_1 = whatsapp_urls.LOREM1
>>> filepath_2 = whatsapp_urls.LOREM2
>>> chat = WhatsAppChat.from_sources(filepaths=[filepath_1, filepath_2])
>>> chat.df.head(5)
                 date        username                                            message
0 2019-10-20 10:16:00            John        Laborum sed excepteur id eu cillum sunt ut.
1 2019-10-20 11:15:00            Mary  Ad aliquip reprehenderit proident est irure mo...
2 2019-10-20 12:16:00  +1 123 456 789  Nostrud adipiscing ex enim reprehenderit minim...
3 2019-10-20 12:57:00  +1 123 456 789  Deserunt proident laborum exercitation ex temp...
4 2019-10-20 17:28:00            John                Do ex dolor consequat tempor et ex.
-
to_txt(filepath, hformat=None)
Export chat to a text file.
Useful for exporting the chat to different formats (i.e. using different hformats).
Parameters: - filepath (str) – Name of the file to export (must be a local path).
- hformat (str, optional) – Header format. Defaults to ‘%y-%m-%d, %H:%M - %name:’.
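To make the header format more concrete, here is a minimal sketch of how a format string such as the documented default could be turned into a message header. It assumes the date directives behave like Python's strftime codes and that %name stands for the username; neither assumption is confirmed by this page, and render_header is a hypothetical helper, not part of whatstk.

```python
from datetime import datetime


def render_header(hformat: str, timestamp: datetime, username: str) -> str:
    """Fill a header format with a timestamp and a username (illustrative only)."""
    # Split on the %name placeholder so strftime never sees it or the username.
    parts = hformat.split("%name")
    rendered = [timestamp.strftime(part) if part else "" for part in parts]
    # Re-join the formatted pieces with the username in place of %name.
    return username.join(rendered)


header = render_header(
    "%y-%m-%d, %H:%M - %name:", datetime(2016, 8, 6, 13, 23), "Ash Ketchum"
)
print(header)  # 16-08-06, 13:23 - Ash Ketchum:
```

Splitting before formatting keeps usernames that contain a percent sign from being interpreted as date directives.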
-
class whatstk.FigureBuilder(df=None, chat=None)
Bases: object
Generate a variety of figures from your loaded chat.
Integrates feature extraction and visualization logic to automate data plots.
Note: Either df or chat must be provided.
Parameters: - df (pandas.DataFrame, optional) – Chat data. Attribute df of a chat loaded using Chat. If a value is given, chat is ignored.
- chat (Chat, optional) – Chat data. Object obtained when a chat is loaded using Chat. Required if df is None.
Attributes
user_color_mapping |
Get mapping between user and color. |
usernames |
Get list with users available in given chat. |
Methods
user_interventions_count_linechart ([…]) |
Plot number of user interventions over time. |
user_message_responses_flow ([title]) |
Get the flow of message responses. |
user_message_responses_heatmap ([norm, title]) |
Get the response matrix heatmap. |
user_msg_length_boxplot ([title, xlabel]) |
Generate figure with boxplots of each user’s message length. |
-
property user_color_mapping
Get mapping between user and color.
Each user is assigned a color automatically, so that this color is preserved for that user in all to-be-generated plots.
Returns: dict – Mapping from username to color (rgb).
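As a rough illustration of a stable user-to-color mapping, the sketch below assigns each distinct username one color from a fixed palette. The palette values, the sorting step and the helper name build_color_mapping are assumptions made for this example; the page does not say how whatstk picks its colors.

```python
from itertools import cycle

# Hypothetical palette (assumption): not the colors whatstk actually uses.
PALETTE = ["rgb(31, 119, 180)", "rgb(255, 127, 14)", "rgb(44, 160, 44)"]


def build_color_mapping(usernames):
    """Assign each distinct user one palette color, reusing colors if needed."""
    # Sorting makes the assignment independent of message order.
    unique_users = sorted(set(usernames))
    return dict(zip(unique_users, cycle(PALETTE)))


mapping = build_color_mapping(["Misty", "Ash Ketchum", "Brock", "Ash Ketchum"])
print(mapping["Ash Ketchum"])  # rgb(31, 119, 180)
```

Building the mapping once and reusing it for every figure is what keeps a user's color the same across plots.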
-
user_interventions_count_linechart(date_mode='date', msg_length=False, cumulative=False, all_users=False, title='User interventions count', xlabel='Date/Time', cummulative=None)
Plot number of user interventions over time.
Parameters: - date_mode (str, optional) – Choose mode to group interventions by. Defaults to 'date'. Available modes are:
    - 'date': Grouped by particular date (year, month and day).
    - 'hour': Grouped by hours.
    - 'month': Grouped by months.
    - 'weekday': Grouped by weekday (i.e. Monday, Tuesday, …, Sunday).
    - 'hourweekday': Grouped by weekday and hour.
- msg_length (bool, optional) – Set to True to count the number of characters instead of number of messages sent.
- cumulative (bool, optional) – Set to True to obtain cumulative counts.
- all_users (bool, optional) – Obtain number of interventions of all users combined. Defaults to False.
- title (str, optional) – Title for plot. Defaults to “User interventions count”.
- xlabel (str, optional) – x-axis label title. Defaults to “Date/Time”.
- cummulative (bool, optional) – Deprecated, use cumulative.
Returns: plotly.graph_objs.Figure – Plotly Figure.
Example
>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_interventions_count_linechart(cumulative=True)
>>> plot(fig)
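The date_mode and msg_length options can be pictured as a choice of grouping key and of what gets counted. The following standard-library sketch shows one possible reading of the documented modes. The key functions (for example, treating 'month' as the month number and numbering weekdays from 0 for Monday) and the helper count_interventions are assumptions for illustration, not the library's implementation.

```python
from collections import Counter
from datetime import datetime

# Hypothetical key functions, one per documented date_mode value.
GROUP_KEYS = {
    "date": lambda ts: ts.date(),
    "hour": lambda ts: ts.hour,
    "month": lambda ts: ts.month,
    "weekday": lambda ts: ts.weekday(),  # 0 = Monday ... 6 = Sunday
    "hourweekday": lambda ts: (ts.weekday(), ts.hour),
}


def count_interventions(messages, date_mode="date", msg_length=False):
    """Count messages (or characters) per (group key, username)."""
    key_of = GROUP_KEYS[date_mode]
    counts = Counter()
    for timestamp, username, text in messages:
        # With msg_length=True, add the number of characters instead of 1.
        counts[(key_of(timestamp), username)] += len(text) if msg_length else 1
    return counts


messages = [
    (datetime(2019, 10, 20, 10, 16), "John", "Hello"),
    (datetime(2019, 10, 20, 11, 15), "Mary", "Hi John"),
    (datetime(2019, 10, 21, 10, 30), "John", "Bye"),
]
by_hour = count_interventions(messages, date_mode="hour")
by_date_chars = count_interventions(messages, date_mode="date", msg_length=True)
print(by_hour[(10, "John")])  # 2
```

A cumulative view would then be a running sum of these counts over the sorted group keys.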
-
user_message_responses_flow(title='Message flow')
Get the flow of message responses.
A response from user X to user Y happens if user X sends a message right after a message from user Y.
Uses a Sankey diagram.
Parameters: title (str, optional) – Title for plot. Defaults to “Message flow”.
Returns: plotly.graph_objs.Figure – Plotly Figure.
Example
>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_message_responses_flow()
>>> plot(fig)
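The response definition used here (user X responds to user Y when X's message comes right after Y's) can be sketched with a plain counter over consecutive messages. The helper count_responses and its skip_self flag are hypothetical names introduced for this example; whether the library counts consecutive messages from the same user is not stated on this page.

```python
from collections import Counter


def count_responses(senders, skip_self=True):
    """Count (responder, original sender) pairs over consecutive messages."""
    responses = Counter()
    for previous, current in zip(senders, senders[1:]):
        if skip_self and previous == current:
            continue  # consecutive messages by one user are not counted here
        # `current` wrote right after `previous`, so `current` responds to `previous`.
        responses[(current, previous)] += 1
    return responses


# Message authors in chronological order.
senders = ["Ash", "Brock", "Misty", "Ash", "Ash", "Misty"]
responses = count_responses(senders)
print(responses[("Misty", "Ash")])  # 1
```

Each (responder, original sender) pair with its count corresponds to one link between two users in a flow diagram.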
-
user_message_responses_heatmap(norm='absolute', title='Response matrix')
Get the response matrix heatmap.
A response from user X to user Y happens if user X sends a message right after a message from user Y.
Parameters: - norm (str, optional) – Specifies the type of normalization used for response count. Can be:
    - 'absolute': Absolute count of messages.
    - 'joint': Normalized by total number of messages sent by all users.
    - 'sender': Normalized per sender by total number of messages sent by user.
    - 'receiver': Normalized per receiver by total number of messages sent by user.
- title (str, optional) – Title for plot. Defaults to “Response matrix”.
Returns: plotly.graph_objs.Figure – Plotly Figure.
Example
>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_message_responses_heatmap()
>>> plot(fig)
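The four norm options can be read as different denominators applied to the same response counts. The sketch below shows one interpretation on a small hand-made matrix: 'joint' divides by the grand total, 'sender' by each sender's row total, and 'receiver' by each receiver's column total. This reading, the sample counts and the helper normalize_responses are assumptions for illustration and may differ from what the library computes.

```python
def normalize_responses(counts, norm="absolute"):
    """Normalize a {(sender, receiver): count} mapping (illustrative only)."""
    if norm == "absolute":
        return dict(counts)
    if norm == "joint":
        total = sum(counts.values())
        return {pair: n / total for pair, n in counts.items()}
    if norm in ("sender", "receiver"):
        # Position 0 of each pair is the sender, position 1 the receiver.
        axis = 0 if norm == "sender" else 1
        totals = {}
        for pair, n in counts.items():
            totals[pair[axis]] = totals.get(pair[axis], 0) + n
        return {pair: n / totals[pair[axis]] for pair, n in counts.items()}
    raise ValueError(f"Unknown norm: {norm!r}")


# Made-up response counts for three users.
counts = {("Ash", "Misty"): 1, ("Ash", "Brock"): 3, ("Misty", "Ash"): 4}
per_sender = normalize_responses(counts, norm="sender")
print(per_sender[("Ash", "Brock")])  # 0.75
```

With 'sender' normalization, the values for each sender sum to 1, which makes users with very different activity levels comparable.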
-
user_msg_length_boxplot(title='User message length', xlabel='User')
Generate figure with boxplots of each user’s message length.
Parameters: - title (str, optional) – Title for plot. Defaults to “User message length”.
- xlabel (str, optional) – x-axis label title. Defaults to “User”.
Returns: dict – Dictionary with data and layout. Plotly compatible.
Example
>>> from whatstk import WhatsAppChat
>>> from whatstk.graph import plot, FigureBuilder
>>> from whatstk.data import whatsapp_urls
>>> chat = WhatsAppChat.from_source(filepath=whatsapp_urls.LOREM)
>>> fig = FigureBuilder(chat=chat).user_msg_length_boxplot()
>>> plot(fig)
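A boxplot of message lengths needs, for each user, the list of lengths of that user's messages. The sketch below collects these lists with the standard library and prints one summary statistic. It assumes length means the number of characters; the sample messages and the helper message_lengths_by_user are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def message_lengths_by_user(messages):
    """Collect the character length of every message, grouped by username."""
    lengths = defaultdict(list)
    for username, text in messages:
        lengths[username].append(len(text))
    return dict(lengths)


# Made-up (username, message) pairs.
messages = [("Ash", "Hey guys!"), ("Misty", "Hi"), ("Ash", "Ok")]
lengths = message_lengths_by_user(messages)
print(median(lengths["Ash"]))  # 5.5
```

Each per-user list is the kind of data a box trace summarizes (median, quartiles and outliers).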
-
property usernames
Get list with users available in given chat.
Returns: list – List with usernames available in chat DataFrame.