Vivienne Prince

Thanks for dropping by my page!

I'm currently a Master of Data Science student graduating in 2022.

I really enjoy learning new tools, exploring data to help solve problems, and helping others understand ideas and data.

GitHub  •  LinkedIn

Reddit Cryptocurrency Historical Sentiment vs Price
November 2020  –  with Kalani Stanton and Tim McCormack

Overview:
We used user post and comment data from Reddit’s two most popular crypto subreddits to explore how average sentiment changed over time for BTC, ETH, LTC, and XRP (Ripple), alongside each currency’s price.

Tools:
Python Reddit API Wrapper: pull Reddit comment data
Tidyverse/Tidytext: [NLP] tokenization and sentiment analysis
Quandl: pull coin price history
ggplot/plotly: visualization

Language used: R/ Markdown
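
To give a feel for the tokenization and sentiment step, here is a minimal sketch of how comments could be turned into monthly sentiment counts with tidytext. It is not the project's actual pipeline: the `comments` data and the `toy_lexicon` are made up for illustration (a real analysis would more likely join against a standard lexicon), and `sntmnt_by_month` is a hypothetical name chosen to mirror the shape of the `sntmntByMonth` table that the plotting code further down expects (Coin, Month, sentiment, n).

```r
library(dplyr)
library(tidytext)
library(lubridate)

# Hypothetical sample comments (not real project data)
comments <- data.frame(
  Coin = c("BTC", "BTC", "ETH"),
  Date = as.Date(c("2020-01-03", "2020-01-20", "2020-02-11")),
  text = c("Great gains today, love it",
           "Terrible crash, bad news",
           "Good upgrade but risky fees"),
  stringsAsFactors = FALSE
)

# Toy lexicon standing in for a real sentiment lexicon (assumption)
toy_lexicon <- data.frame(
  word = c("great", "love", "good", "terrible", "crash", "bad", "risky"),
  sentiment = c("positive", "positive", "positive",
                "negative", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

sntmnt_by_month <- comments %>%
  unnest_tokens(word, text) %>%                  # one lowercase word per row
  inner_join(toy_lexicon, by = "word") %>%       # keep only words with a sentiment label
  mutate(Month = floor_date(Date, "month")) %>%  # bucket each comment into its month
  count(Coin, Month, sentiment)                  # n = matched words per coin/month/sentiment

sntmnt_by_month
```

With this toy input, BTC ends up with 3 negative and 2 positive words in January 2020, and ETH with 1 of each in February 2020.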

For full details see our git repo :-)


This is a time series sentiment vs price data visualization I put together for our project:

My code:

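# packages used below (melt() is assumed to come from reshape2)
library(dplyr)
library(ggplot2)
library(reshape2)
library(lubridate)
library(stringr)
library(plotly)

# function for getting Coin Price/ Sentiment vs Time graph by coin
# expects the data frames PricesByCoin and sntmntByMonth to already exist
# coin  = "COIN_NAME"
# coeff = value used to scale sentiment counts to match the price scale on the graph
get_graph <- function(coin, coeff) {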
  
  # get related coin data
  coinprice_data <- PricesByCoin %>% filter(Coin == coin)
  coin_sntmntByMonth <- sntmntByMonth %>% filter(Coin == coin)
  
  # reshape coin price by day data to merge high, low, last, med into one variable
  price_by_mkt_metric <- melt(coinprice_data, id = c("Date", "Coin"))
  colnames(price_by_mkt_metric)[3] <- "Mkt_Metrics"
  
  # normalize x-values for both datasets (date)
  price_by_mkt_metric$Date <- as_date(price_by_mkt_metric$Date)
  coin_sntmntByMonth$Month <- as_date(coin_sntmntByMonth$Month)
  

  # make the gg plot
  Coin_Daily_Price.plot <- price_by_mkt_metric %>%
    filter((Mkt_Metrics %in% c("High", "Low", "Last"))) %>%
    
    # ggplot setup
    ggplot(aes(x = Date)) +
    theme_minimal() +
    ggtitle(paste(coin, "Sentiment/ Price vs Time")) +   # paste() already inserts a space
    xlab("Date") +
    theme(legend.title = element_blank())  +
    
    
    # plot price vs time lines
    geom_line(
      stat = 'identity',
      aes(
        y = value,
        linetype = Mkt_Metrics,
        color = Mkt_Metrics,
        size = Mkt_Metrics,
        alpha = Mkt_Metrics)) +
    scale_linetype_manual("Market Metrics", values = c("solid", "solid", "solid")) +
    scale_color_manual("Market Metrics", values = c('#EF9A9A', '#C5E1A5', '#212121')) +
    scale_size_manual("Market Metrics", values = c(1, 1, 0.3)) +
    scale_alpha_manual("Market Metrics", values = c(0.8, 0.8, 1)) +
    
    
    # plot sentiment bars (stacked)
    geom_bar(
      data = coin_sntmntByMonth,
      stat = 'identity',
      aes(
        x = Month,
        y = n / coeff,
        fill = sentiment)) +
    
    
    # set up y-axes: bars are drawn at n / coeff, so the secondary axis multiplies by coeff to show raw counts
    scale_y_continuous(name = "Price (USD)",
                       sec.axis = sec_axis( ~ . * coeff,  name = "Sentiment (n)"))
  
  # convert to plotly
  Coin_Daily_Price.plotly <- ggplotly(Coin_Daily_Price.plot, tooltip = c("label", "x", "y"))
  
  # clean up legend labels: keep only the text before the first comma and drop the "("
  # (e.g. a trace named "(High,1)" becomes "High")
  for (i in seq_along(Coin_Daily_Price.plotly$x$data)) {
    trace_name <- Coin_Daily_Price.plotly$x$data[[i]]$name
    if (!is.null(trace_name)) {
      Coin_Daily_Price.plotly$x$data[[i]]$name <-
        gsub("\\(", "", str_split(trace_name, ",")[[1]][1])
    }
  }
  
  # return the interactive plot (the static ggplot is only an intermediate step)
  Coin_Daily_Price.plotly

}

get_graph("BTC", 4)
get_graph("ETH", 20)
get_graph("LTC", 19)
get_graph("XRP", 8000)
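
The coeff values above differ a lot by coin because each one has to map sentiment counts onto that coin's price range. One possible way to get a starting value, sketched here under assumptions and not necessarily how the numbers above were chosen, is to divide the tallest monthly sentiment total by the highest price so that the tallest stacked bar reaches the top of the price axis. `suggest_coeff` is a hypothetical helper and the numbers are made up for illustration.

```r
# Hypothetical helper: ratio of the tallest stacked bar to the highest price
suggest_coeff <- function(monthly_totals, prices) {
  max(monthly_totals, na.rm = TRUE) / max(prices, na.rm = TRUE)
}

# Made-up numbers for illustration only
monthly_totals <- c(12000, 30000, 18000)  # total sentiment words per month
daily_high     <- c(7200, 7500, 6900)     # daily high prices in USD

coeff <- suggest_coeff(monthly_totals, daily_high)
coeff                        # 4
max(monthly_totals / coeff)  # 7500, the same height as the highest price
```

A value found this way would usually still be rounded or nudged by hand so the bars do not hide the price lines.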