Corpus
The Corpus app is concerned with the representation of raw message data and its associated metadata.
Models

class msgvis.apps.corpus.models.Dataset(*args, **kwargs)
A top-level dataset object containing messages.
- name = None: The name of the dataset.
- description = None: A description of the dataset.
- created_at = None: The datetime.datetime when the dataset was created.
- start_time = None: The time of the first real message in the dataset.
- end_time = None: The time of the last real message in the dataset.

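As an illustration of how start_time and end_time relate to the messages in a dataset, the following plain-Python sketch derives both from a list of message timestamps. DatasetSummary and summarize are hypothetical stand-ins, not part of msgvis; the real Dataset is a Django model whose field definitions are not shown on this page. The sketch also treats every timestamp as a "real" message, which is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass
class DatasetSummary:
    """Hypothetical stand-in mirroring the documented Dataset attributes."""
    name: str
    description: str
    created_at: datetime
    start_time: Optional[datetime] = None  # time of the first message
    end_time: Optional[datetime] = None    # time of the last message


def summarize(name: str, description: str, times: Iterable[datetime]) -> DatasetSummary:
    """Build a summary whose time span covers the given message times."""
    times = list(times)
    return DatasetSummary(
        name=name,
        description=description,
        created_at=datetime.now(timezone.utc),
        start_time=min(times) if times else None,
        end_time=max(times) if times else None,
    )


times = [
    datetime(2015, 1, 2, 12, 0, tzinfo=timezone.utc),
    datetime(2015, 1, 1, 8, 30, tzinfo=timezone.utc),
    datetime(2015, 1, 3, 23, 59, tzinfo=timezone.utc),
]
summary = summarize("example", "A toy dataset", times)
print(summary.start_time, summary.end_time)
```

An empty dataset leaves both times as None, which matches the documented default of None for these attributes.
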
class msgvis.apps.corpus.models.MessageType(*args, **kwargs)
The type of a message, e.g. retweet, reply, original, system...
- name = None: The name of the message type.

class msgvis.apps.corpus.models.Language(*args, **kwargs)
Represents the language of a message or a user.
- code = None: A short language code like ‘en’.
- name = None: The full name of the language.

class msgvis.apps.corpus.models.Url(*args, **kwargs)
A URL from a message.
- domain = None: The root domain of the URL.
- short_url = None: A shortened URL.
- full_url = None: The full URL.

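The relationship between full_url and domain can be sketched with the standard library. This page does not say how msgvis computes the root domain, so url_domain below is a hypothetical helper that simply takes the host name and drops a leading "www."; it does not collapse other subdomains.

```python
from urllib.parse import urlparse


def url_domain(full_url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    # hostname is None when the string has no network location part.
    host = urlparse(full_url).hostname or ""
    return host[4:] if host.startswith("www.") else host


print(url_domain("https://www.example.com/path?q=1"))  # example.com
```
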
class msgvis.apps.corpus.models.Hashtag(*args, **kwargs)
A hashtag in a message.
- text = None: The text of the hashtag, without the hash.

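Because text is stored without the leading hash, an importer has to strip it when extracting hashtags. A minimal sketch, assuming hashtags are runs of word characters after "#"; the pattern and the hashtag_texts helper are illustrative and not taken from msgvis.

```python
import re
from typing import List

# Capture only the characters after the hash sign.
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_texts(message_text: str) -> List[str]:
    """Return hashtag texts in order of appearance, without the '#'."""
    return HASHTAG_RE.findall(message_text)


print(hashtag_texts("Loving #Python and #django!"))  # ['Python', 'django']
```
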
class msgvis.apps.corpus.models.Media(*args, **kwargs)
Linked media, e.g. photos or videos.
- type = None: The kind of media this is.
- media_url = None: A URL where the media may be accessed.

class msgvis.apps.corpus.models.Timezone(*args, **kwargs)
The timezone of a message or user.
- olson_code = None: The timezone code from pytz.
- name = None: Another name for the timezone, perhaps the country where it is located?

class msgvis.apps.corpus.models.Person(*args, **kwargs)
A person who sends messages in a dataset.
- original_id = None: An external id for the person, e.g. a user id from Twitter.
- username = None: Username is a short system-y name.
- full_name = None: Full name is a longer user-friendly name.
- message_count = None: The number of messages the person produced.
- replied_to_count = None: The number of times the person’s messages were replied to.
- The number of times the person’s messages were shared or retweeted.
- mentioned_count = None: The number of times the person was mentioned in other people’s messages.
- friend_count = None: The number of people this user has connected to.
- follower_count = None: The number of people who have connected to this person.
- profile_image_url = None: The person’s profile image URL.

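The per-person counters can be thought of as tallies over the messages in a dataset. The sketch below is an assumption about how such tallies might be produced, not the msgvis import code: it takes hypothetical (username, text) pairs, counts messages per sender for message_count, and counts "@name" mentions for mentioned_count.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Illustrative mention pattern: '@' followed by word characters.
MENTION_RE = re.compile(r"@(\w+)")


def tally(messages: Iterable[Tuple[str, str]]) -> Tuple[Counter, Counter]:
    """Return (message_count, mentioned_count) keyed by username."""
    message_count: Counter = Counter()
    mentioned_count: Counter = Counter()
    for sender, text in messages:
        message_count[sender] += 1
        # Every '@name' occurrence counts as one mention.
        mentioned_count.update(MENTION_RE.findall(text))
    return message_count, mentioned_count


messages = [
    ("alice", "hi @bob"),
    ("bob", "@alice @carol hello"),
    ("alice", "again @bob"),
]
message_count, mentioned_count = tally(messages)
print(message_count["alice"], mentioned_count["bob"])  # 2 2
```
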
class msgvis.apps.corpus.models.Message(*args, **kwargs)
The Message is the central data entity for the dataset.
- original_id = None: An external id for the message, e.g. a tweet id from Twitter.
- type: The MessageType of the message: retweet, reply, original...
- time = None: The datetime.datetime (in UTC) when the message was sent.
- sentiment = None: The sentiment label for the message.
- replied_to_count = None: The number of replies this message received.
- The number of times this message was shared or retweeted.
- The set of Hashtag objects in the message.
- text = None: The actual text of the message.

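Since time is documented as a UTC datetime.datetime, timestamps recorded in another zone need converting before they are stored. A minimal sketch, assuming the input is timezone-aware; to_utc is a hypothetical helper and the fixed -08:00 offset is only example data.

```python
from datetime import datetime, timedelta, timezone


def to_utc(sent_at: datetime) -> datetime:
    """Convert an aware datetime to UTC; reject naive values."""
    if sent_at.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return sent_at.astimezone(timezone.utc)


pacific = timezone(timedelta(hours=-8))  # fixed offset, for illustration only
local = datetime(2015, 3, 1, 20, 15, tzinfo=pacific)
print(to_utc(local).isoformat())  # 2015-03-02T04:15:00+00:00
```

Rejecting naive datetimes avoids silently interpreting them in the machine's local zone.
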