Skip to content

Database Structure

Entity-Relationship Overview

  • Tickers
    Stores stock symbols and related metadata.

  • Posts
    Contains posts or articles related to a ticker. This now also includes an optional author field.

  • Points
    Represents extracted investment thesis points from posts. Each point includes a sentiment score, the extracted text, and an embedding vector for semantic similarity, along with a flag indicating if any criticism exists.

  • Criticisms
    Contains valid criticisms linked to a specific point, optionally referencing the comment that inspired the criticism.

  • Comments
    Stores individual comments or feedback related to posts, which may also be referenced by criticisms.


Tables and Fields

1. Tickers

Stores information about each stock ticker.

  • Fields:

    • id: Primary Key, Integer.

    • symbol: String (up to 10 characters), Unique, Not Null.

    • name: String (up to 100 characters).

    • description: Text – an AI-generated description based on financial data.

    • overall_sentiment_score: Integer, constrained with a check (“overall_sentiment_score BETWEEN 1 and 100”), Nullable.

    • last_analyzed: DateTime – timestamp of the last analysis.

    • description_last_analyzed: DateTime – timestamp of the last description update.

  • Relationships:

    • Posts: One-to-many relationship with Posts (cascade delete enabled).

    • Points: One-to-many relationship with Points (cascade delete enabled).

  • Additional Note:

    • A custom __repr__ method is defined for debugging and logging purposes.

2. Posts

Contains posts or articles from various sources related to a ticker.

  • Fields:

    • id: Primary Key, Integer.

    • ticker_id: Foreign Key referencing tickers.id, Not Null.

    • source: String (up to 50 characters), Not Null – e.g., Reddit, Substack.

    • title: String (up to 250 characters), Nullable.

    • author: String (up to 100 characters), Nullable – represents the author of the post.

    • link: Text – URL to the original post.

    • date_of_post: DateTime – publication timestamp.

    • content: Text – full content of the post.

  • Relationships:

    • Ticker: Belongs to a single Ticker.

    • Points: One-to-many relationship with Points (cascade delete enabled).

    • Comments: One-to-many relationship with Comments (cascade delete enabled).

  • Additional Note:

    • A custom __repr__ method provides a concise string representation of each post.

3. Points

Extracted investment thesis points from posts.

  • Fields:

    • id: Primary Key, Integer.

    • ticker_id: Foreign Key referencing tickers.id, Not Null.

    • post_id: Foreign Key referencing posts.id, Not Null.

    • sentiment_score: Integer, with a check constraint (“sentiment_score BETWEEN 1 and 100”), Not Null.

    • text: Text, Not Null – the extracted point or thesis.

    • criticism_exists: Boolean, default set to False – indicates whether any valid criticism is linked.

    • embedding: ARRAY(Float) – stores a semantic embedding vector for similarity comparisons.

  • Relationships:

    • Ticker: Belongs to a Ticker.

    • Post: Linked to the originating Post.

    • Criticisms: One-to-many relationship with Criticisms (cascade delete enabled).

  • Additional Note:

    • The custom __repr__ method returns a brief representation of the point including its sentiment score and a preview of the text.

4. Criticisms

Stores valid criticisms associated with specific points.

  • Fields:

    • id: Primary Key, Integer.

    • point_id: Foreign Key referencing points.id, Not Null.

    • comment_id: Foreign Key referencing comments.id, Nullable – if the criticism is linked to a particular comment.

    • text: Text, Not Null – the content of the criticism.

    • date_posted: DateTime – timestamp indicating when the criticism was recorded.

    • validity_score: Integer, with a check constraint (“validity_score BETWEEN 1 and 100”), Nullable – quantifies the strength or validity of the criticism.

  • Relationships:

    • Point: Belongs to a specific Point.

    • Comment: Optionally linked to a Comment from which the criticism originated.

  • Additional Note:

    • A custom __repr__ method provides a concise summary of the criticism.

5. Comments

Contains individual comments or feedback linked to posts.

  • Fields:

    • id: Primary Key, Integer.

    • post_id: Foreign Key referencing posts.id, Not Null.

    • content: Text, Not Null – the comment’s text.

    • link: Text – optional URL or reference.

    • author: String (up to 100 characters), Nullable – the name of the comment’s author.

  • Relationships:

    • Post: Belongs to a specific Post.

    • Criticisms: One-to-many relationship with Criticisms (cascade delete enabled), allowing a comment to be referenced by one or more criticisms.

  • Additional Note:

    • A custom __repr__ method is defined for ease of debugging and logging.

Sample Data Entries

Tickers

idsymbolnamedescriptionoverall_sentiment_scorelast_analyzeddescription_last_analyzed
1AAPLApple Inc.852023-10-15 14:352023-10-10 14:40
2TSLATesla, Inc.702023-10-15 14:352023-10-10 12:30

Posts

idticker_idsourcetitleauthorlinkdate_of_postcontent
11RedditApple’s Q4 EarningsAnalystAhttp://reddit.com/post12023-10-15 14:35We should all invest in Apple…
22SubstackTesla’s New ModelAnalystBhttp://substack.com/p22023-10-15 14:35I think TSLA is overvalued…

Points

idticker_idpost_idsentiment_scoretextcriticism_existsembedding
11185”Strong sales in Q4”False[0.134, 0.298, -0.076, 0.415, 0.092, …]
22232”Concerns over production”True[-0.213, 0.412, 0.103, -0.349, 0.287, …]

Criticisms

idpoint_idcomment_idtextdate_postedvalidity_score
125”Production issues are temporary”2023-10-14 12:0080

Comments

idpost_idcontentlinkauthor
52”Industry reports suggest that production issues are only short-term.”http://example.com/comment5AnalystX