FinTech Project Guidelines
Cryptocurrency Investment Product
2025
Project Overview
This FinTech project is divided into two main parts that build on each other to create a complete cryptocurrency portfolio optimization system. You will use both structured data (historical cryptocurrency price and volume data) and unstructured textual data (news articles) from CryptoCompare (CoinDesk) to create investment factors, build models that optimize portfolio weights and create systematic trading strategies enhanced by sentiment analysis.
Data Sources:
• Structured Data: Historical cryptocurrency price and volume data from CryptoCompare API
• Unstructured Data: News articles from CryptoCompare News API
Structure:
• Part A: Data Design and Analysis Report (20% of total course grade) due Week 8, 25th July Friday, 5:00pm
• Part B: Model Design and Implementation (50% of total course grade) due Week 10, 8th August Friday, 5:00pm
Target Audience: Team of VCs, private equity groups, project managers
Report Length Part A: Maximum 5-6 pages per part with unlimited Appendix and Reference pages.
Report Length Part B: Maximum 7-8 pages per part with unlimited Appendix and Reference pages.
UNSW referencing guidelines must be followed at all times.
Note: The page lengths above are maximums; you do not need to hand in this many pages.
FinTech Project Part A: Data Design and Analysis Report
Due Date: Week 8, 25th July Friday, 5:00pm
Weight: 20% of grade (10% Station #1 + 10% Station #2)
Station #1: Data Collection and Processing (10%)
Your task is to design and implement the data foundation for your cryptocurrency portfolio optimization system using both structured and unstructured data sources.
The points below are intended only as a guide. You are strongly encouraged to come up with your own report template and design. These points are not compulsory and are not graded as a checklist; they are only there to help guide you.
a) Product Description Describe what your cryptocurrency portfolio optimization product will do. Explain the main purpose and how it will help users make investment decisions by combining market data with sentiment analysis from news. For example, will it automatically rebalance crypto portfolios based on both technical indicators and news sentiment, or provide optimal allocation recommendations across different cryptocurrencies enhanced by market sentiment insights? Will it focus only on building investment factors such as short-term reversal, momentum etc.?
b) Product Basic Use and Interface - Online or Offline? Explain how your product will work:
• Will it work in real-time with live cryptocurrency market data and news feeds?
– Discuss your product’s use of an API. Will it use a GET connection or a websocket? What are the benefits/drawbacks of either method?
• Will it analyze both historical crypto data and news articles for backtesting strategies?
– Will the product offer investment factors (short-term reversal, momentum etc.) as well as an optimization service?
• Will users interact with it through a website, or mobile app?
• How often will it update portfolio allocations considering both price movements and sentiment shifts?
– Weekly rebalancing is recommended – but you can choose any frequency.
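For the GET option raised in this section, a polling request could look roughly like the sketch below. The endpoint URL, the `fsym`/`tsym`/`limit` parameters and the nested `Data.Data` response shape are assumptions about the CryptoCompare daily-history API and should be checked against its current documentation; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the current CryptoCompare documentation.
BASE_URL = "https://min-api.cryptocompare.com/data/v2/histoday"


def build_histoday_url(symbol, quote="USD", limit=30):
    """Build the GET URL for daily OHLCV bars of one asset."""
    return BASE_URL + "?" + urlencode({"fsym": symbol, "tsym": quote, "limit": limit})


def parse_histoday(payload):
    """Extract OHLCV rows from a decoded JSON payload (assumed response shape)."""
    rows = payload.get("Data", {}).get("Data", [])
    return [
        {
            "time": r["time"],          # Unix timestamp in seconds (UTC)
            "open": r["open"],
            "high": r["high"],
            "low": r["low"],
            "close": r["close"],
            "volume": r["volumefrom"],  # volume in units of the base asset
        }
        for r in rows
    ]


def fetch_daily_bars(symbol, quote="USD", limit=30, timeout=10):
    """One polling request; a websocket would instead keep a connection open."""
    with urlopen(build_histoday_url(symbol, quote, limit), timeout=timeout) as resp:
        return parse_histoday(json.load(resp))
```

Keeping URL construction and parsing separate from the network call makes both easy to test offline. A GET request like this suits weekly or daily rebalancing, while a websocket is mainly worth its extra complexity when the product needs streaming updates.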
c) Input Data - Structured and Unstructured Specify exactly what data you need:
Structured Data:
• Cryptocurrency price data (open, high, low, close, volume) for multiple crypto assets
• Trading volume information
• Time period you will analyze
• Data frequency
• Cryptocurrency symbols for your chosen universe (BTC, ETH, ADA, etc.)
• Table of descriptive statistics
Unstructured Data:
• News articles from CryptoCompare News API
• Article metadata (publication date, source, tags, cryptocurrency mentions)
• Text content (headlines, article body, summaries)
• News frequency and volume over time
• Cryptocurrency-specific news filtering and categorization
d) Data Processing Requirements Explain what needs to be calculated, stored, and cleaned:
Structured Data Processing:
• How you will handle missing data points and cryptocurrency delistings
• Methods for detecting and removing outliers in crypto price data
• Calculations for daily returns
• Data storage format and organization across multiple crypto assets (Long or wide format?)
• Data quality checks you will perform
Unstructured Data Processing:
• Text cleaning and preprocessing pipeline for articles from the CryptoCompare News API
• Handling of different news sources and article formats
• Methods for extracting cryptocurrency mentions from articles
• Dealing with duplicate articles and spam content
• Text normalization and standardization procedures
• Storage format for processed news data
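As an illustration of the structured-data steps listed above, the sketch below computes daily simple returns, r_t = P_t / P_{t-1} - 1, while propagating missing prices, and screens for outliers. The median-absolute-deviation rule and its threshold of 5 are assumptions chosen for the example, not a prescribed method.

```python
import statistics


def simple_returns(prices):
    """Daily simple returns r_t = P_t / P_{t-1} - 1; None where a price is missing."""
    out = []
    for prev, cur in zip(prices, prices[1:]):
        if prev is None or cur is None or prev == 0:
            out.append(None)
        else:
            out.append(cur / prev - 1.0)
    return out


def flag_outliers(returns, threshold=5.0):
    """Flag returns further than `threshold` scaled MADs from the median."""
    valid = [r for r in returns if r is not None]
    if not valid:
        return [False] * len(returns)
    med = statistics.median(valid)
    mad = statistics.median(abs(r - med) for r in valid)
    if mad == 0:
        return [False] * len(returns)
    # 1.4826 makes the MAD comparable to a standard deviation under normality.
    scale = 1.4826 * mad
    return [r is not None and abs(r - med) / scale > threshold for r in returns]
```

Flagging rather than deleting keeps the decision visible: large crypto moves are often genuine, so a flagged return may deserve inspection instead of removal.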
e) Output Data Describe your final clean datasets:
Structured Data Output:
• File format and structure for multi-crypto data
• Variables/columns included (returns, prices, volumes, market cap)
• Expected number of observations per cryptocurrency
• How the data will be organized for portfolio analysis
• Return matrix structure for optimization algorithms
Unstructured Data Output:
• Cleaned and processed news article dataset structure
• Text features and metadata preserved
• Cryptocurrency-news linkage methodology
• Expected number of articles per cryptocurrency and time period
• Data format ready for sentiment analysis in Station #2
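One possible shape for the return matrix mentioned under Structured Data Output is a T x N table with one row per date and one column per cryptocurrency. The sketch below pivots long-format rows into that layout; the record layout `(date, symbol, ret)` and the choice to drop incomplete dates are assumptions for illustration.

```python
def to_return_matrix(records):
    """Pivot long-format (date, symbol, ret) rows into a wide T x N matrix.

    Dates are ISO strings, so sorting them is chronological. Dates on which
    any asset lacks a return are dropped so that every row is complete.
    """
    symbols = sorted({sym for _, sym, _ in records})
    by_date = {}
    for date, sym, ret in records:
        by_date.setdefault(date, {})[sym] = ret
    dates, matrix = [], []
    for date in sorted(by_date):
        row = by_date[date]
        if all(row.get(s) is not None for s in symbols):
            dates.append(date)
            matrix.append([row[s] for s in symbols])
    return dates, symbols, matrix
```

Long format is convenient for storage and for appending new assets, while most optimization routines expect the wide matrix, so keeping both and converting on demand is a common compromise.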
f) Station #1 Definition In a few sentences, explain how you would define Station #1 from a Data Management Perspective specifically for the context of the application that you are building. Focus on its role as the foundation of your cryptocurrency portfolio optimization system that integrates both structured market data and unstructured news data.
Station #2: Feature Engineering and Text Processing (10%)
Your task is to create meaningful variables from both the raw cryptocurrency data and processed news articles for factor construction and portfolio optimization.
a) Input Requirements List what inputs you need to create useful features for your crypto portfolio strategy:
• Clean price and return data from Station #1
• Volume data
• Processed news articles and metadata from Station #1
• Specific time periods for calculating rolling statistics
• Crypto market benchmark
b) Data Collection and Format Requirements Specify your requirements for data collection and formats:
• How data should be formatted (timestamp alignment across crypto assets and news)
• Frequency of data collection and rebalancing
• How to handle trading hours and timezone differences
• Synchronization between market data and news publication times
• File organization and naming conventions for multi-crypto and news data
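The synchronization point above matters because a news item must not influence a portfolio decision made before it was published. A minimal sketch, assuming article timestamps are Unix seconds and daily bars close at midnight UTC (both assumptions to confirm against your data):

```python
from datetime import datetime, timedelta, timezone


def utc_date(unix_seconds):
    """Convert a Unix timestamp in seconds to a UTC calendar date."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).date()


def first_tradable_date(published_on):
    """Earliest date on which an article may inform a rebalance.

    Assumes daily bars that close at midnight UTC, so an article published
    during day D is only fully known at the close of D and is used from D + 1.
    """
    return utc_date(published_on) + timedelta(days=1)
```

Shifting every article to the following day is conservative, but it rules out look-ahead bias without needing intraday prices.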
c) Core Features - Structured Data Describe the main features you will create from the raw cryptocurrency data:
• Return features: Historical returns, rolling means, momentum indicators
• Risk measures: Volatility, Value-at-Risk, maximum drawdown
• Many other features can be created
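The return and risk features named above could be computed along the following lines. This is a sketch under simple assumptions: complete price and return series with no gaps, sample standard deviation for volatility, and an empirical-quantile definition of Value-at-Risk; window lengths and confidence levels are left to the caller.

```python
import statistics


def rolling_volatility(returns, window):
    """Sample standard deviation over each trailing window of `window` returns."""
    return [
        statistics.stdev(returns[i - window:i])
        for i in range(window, len(returns) + 1)
    ]


def momentum(prices, lookback):
    """Cumulative return over the last `lookback` periods: P_t / P_{t-lookback} - 1."""
    return prices[-1] / prices[-1 - lookback] - 1.0


def max_drawdown(prices):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = min(worst, p / peak - 1.0)
    return worst


def historical_var(returns, level=0.95):
    """Historical VaR: the loss at the (1 - level) empirical quantile, as a positive number."""
    ordered = sorted(returns)
    index = int((1.0 - level) * len(ordered))
    return -ordered[index]
```

For example, prices of 100, 120, 90 and 108 give a maximum drawdown of 90 / 120 - 1 = -25% and a three-period momentum of 8%.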
d) Text Processing Pipeline (4-5 Stages) Describe your systematic approach to cleaning and processing the unstructured news data:
Stage 1: Basic Text Cleaning
• Removing HTML tags, special characters, and formatting artifacts
• Converting text to lowercase
• Handling encoding issues and non-ASCII characters
Stage 2: Text Normalization
• Tokenization of article text
• Removing stop words and common phrases
• Handling cryptocurrency-specific terminology and abbreviations
Stage 3: Content Filtering
• Identifying and extracting cryptocurrency mentions
• Filtering relevant vs. irrelevant news content
• Removing duplicate or near-duplicate articles
Stage 4: Feature Extraction
• Creating text-based features (word counts, article length, etc.)
• Extracting temporal patterns in news coverage
• Building cryptocurrency-specific news volume metrics
Stage 5: Data Integration
• Aligning news data with market data timestamps
• Creating combined datasets for analysis
• Quality assurance and validation checks
For each structured data feature type, explain:
• Why it might be useful for cryptocurrency factor creation / portfolio optimization
• How you will calculate it (include equations!)
• What you expect the feature values to look like
For each text processing stage, explain:
• Specific techniques and tools you will use
• Expected challenges and how you will address them
• Quality metrics for evaluating processing success
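Stages 1 to 3 of the pipeline above can be sketched with the standard library alone. The stop-word list and the alias-to-ticker map below are deliberately tiny, hypothetical placeholders, and exact-match deduplication is a simplification of near-duplicate detection.

```python
import html
import re

# Illustrative stop-word list and ticker map; a real pipeline would use fuller resources.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "on", "for"}
COIN_ALIASES = {"btc": "BTC", "bitcoin": "BTC", "eth": "ETH", "ethereum": "ETH"}


def clean_text(raw):
    """Stage 1: unescape entities, strip HTML tags, lowercase, collapse whitespace."""
    text = html.unescape(raw)
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Stage 2: whitespace tokenization with stop-word removal."""
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def coin_mentions(tokens):
    """Stage 3: map token aliases to ticker symbols."""
    return sorted({COIN_ALIASES[t] for t in tokens if t in COIN_ALIASES})


def deduplicate(articles):
    """Stage 3: drop articles whose cleaned token sequence was already seen."""
    seen, unique = set(), []
    for raw in articles:
        key = tuple(tokenize(clean_text(raw)))
        if key not in seen:
            seen.add(key)
            unique.append(raw)
    return unique
```

One design point to weigh: VADER treats capitalization and punctuation such as exclamation marks as intensity cues, so it may be preferable to score lightly cleaned text and reserve the lowercased tokens for filtering, mention extraction and deduplication.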
e) Back-and-forth between Station #1 and Station #2 Definition
• Once you compute additional features from both structured and unstructured data, do you catch any additional data errors?
• Describe the linkages between Station 1 and Station 2 for both data types.
• How does text processing reveal issues with the original news data collection?
f) Station #2 Definition In a few sentences, explain how you would define Station #2 from a Data Management Perspective. Focus on how it transforms both raw cryptocurrency data and unstructured news text into useful predictors for portfolio optimization. Be specific about the dual nature of your feature engineering process.
FinTech Project Part B: Model Design and Implementation
Weight: 50% of grade (25% Station #3 + 25% Station #4)
Station #3: Model Design and Sentiment Analysis (25%)
Your task is to implement portfolio optimization models enhanced with sentiment analysis for creating systematic cryptocurrency allocation strategies.
a) Model Selection and Design Describe what models you have implemented and plan to use:
• Factor creation: Will you create cryptocurrency risk-factors enhanced with sentiment indicators?
• Sentiment Analysis: Implementation of VADER sentiment model for news analysis
• Sentiment Indexing: Construction of overall market sentiment index and cryptocurrency-specific sentiment metrics
• Classical optimization: Mean-variance optimization, minimum variance portfolios
• Risk parity models: Equal risk contribution, Equal weighted portfolio
• How you will combine sentiment signals with traditional optimization approaches
• Your approach for dynamic rebalancing incorporating both price and sentiment factors
• Will you incorporate transaction costs and cryptocurrency trading constraints?
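To make the model families above concrete, the sketch below shows three simple weighting rules using NumPy. The minimum-variance rule is the closed-form solution with only the full-investment constraint, so it can produce negative weights; adding long-only or turnover constraints would typically require a solver such as cvxpy. Inverse-volatility weighting is used here as a simple stand-in for equal risk contribution, which it matches only when all pairwise correlations are equal.

```python
import numpy as np


def min_variance_weights(cov):
    """Closed-form minimum-variance weights: w = inv(S) 1 / (1' inv(S) 1).

    Only the full-investment constraint is imposed, so weights may be negative.
    """
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


def equal_weights(n_assets):
    """Equal-weighted benchmark portfolio."""
    return np.full(n_assets, 1.0 / n_assets)


def inverse_volatility_weights(cov):
    """Weights proportional to 1 / sigma_i (a simple risk-parity heuristic)."""
    inv_vol = 1.0 / np.sqrt(np.diag(np.asarray(cov, dtype=float)))
    return inv_vol / inv_vol.sum()
```

With two uncorrelated assets whose variances are 0.04 and 0.01, the minimum-variance rule allocates 20% and 80%, while inverse-volatility weighting allocates one third and two thirds.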
b) Sentiment Analysis Implementation Detail your approach to building sentiment tools:
VADER Sentiment Model:
• Implementation of VADER (Valence Aware Dictionary and sEntiment Reasoner) for cryptocurrency news
• Adaptation of VADER for cryptocurrency-specific terminology and context
• Processing pipeline for converting news articles to sentiment scores
• Handling of negations, intensifiers, and crypto-specific language patterns
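One way to adapt VADER to crypto vocabulary is to extend its lexicon before scoring. The sketch below assumes the `vaderSentiment` package, whose analyzer exposes a `lexicon` dictionary and a `polarity_scores` method returning a `compound` score; the terms and valence values are hypothetical and would need to be calibrated, and the guarded import is there only so the sketch still loads where the package is not installed.

```python
try:
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
except ImportError:  # package not installed; build_analyzer then returns None
    SentimentIntensityAnalyzer = None

# Hypothetical valence values on VADER's roughly -4 to +4 lexicon scale.
CRYPTO_LEXICON = {"hodl": 1.0, "bullish": 2.0, "bearish": -2.0, "rekt": -2.5, "fud": -1.5}


def build_analyzer(extra_terms=CRYPTO_LEXICON):
    """Create a VADER analyzer whose lexicon is extended with crypto terms."""
    if SentimentIntensityAnalyzer is None:
        return None
    analyzer = SentimentIntensityAnalyzer()
    analyzer.lexicon.update(extra_terms)
    return analyzer


def score_article(analyzer, text):
    """Return VADER's compound score, which lies between -1 and 1."""
    return analyzer.polarity_scores(text)["compound"]
```

The lexicon is matched token by token, so multi-word crypto phrases need separate handling, for instance by rewriting them to a single token before scoring.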
Sentiment Index Construction:
• Methodology for aggregating individual article sentiment scores
• Time-weighted sentiment averaging and rolling windows
• Overall market sentiment index calculation
• Cryptocurrency-by-cryptocurrency sentiment metrics
• Handling of news volume variations and sentiment score normalization
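The aggregation steps above could be sketched as follows, starting from per-article compound scores. Equal weighting of articles within a day, a trailing rolling mean and full-sample z-scores are all assumptions made for brevity.

```python
import statistics
from collections import defaultdict


def daily_sentiment(scored_articles):
    """Average compound score per date from (date, score) pairs."""
    buckets = defaultdict(list)
    for date, score in scored_articles:
        buckets[date].append(score)
    return {d: statistics.fmean(v) for d, v in sorted(buckets.items())}


def rolling_mean(values, window):
    """Trailing mean; early points use the observations available so far."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(statistics.fmean(chunk))
    return out


def z_scores(values):
    """Standardize a series to zero mean and unit sample standard deviation."""
    mu, sd = statistics.fmean(values), statistics.stdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]
```

Note that z-scores computed over the full sample use future information; in a backtest the mean and standard deviation should come from data available up to each date only. Applying `daily_sentiment` to articles filtered by coin gives the cryptocurrency-specific metrics, and applying it to all articles gives an overall market index.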
Integration with Portfolio Models:
• How sentiment scores will be incorporated into portfolio optimization
• Weighting schemes for combining sentiment with traditional factors
• Dynamic adjustment of portfolio weights based on sentiment shifts
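One simple, hypothetical way to combine sentiment with optimized weights is a multiplicative tilt: each long-only base weight is scaled by exp(strength * z), where z is the asset's standardized sentiment, and the result is renormalized. The functional form and the `strength` parameter are illustrative assumptions, not a recommended scheme.

```python
import math


def sentiment_tilt(base_weights, sentiment_z, strength=0.1):
    """Tilt long-only base weights by exp(strength * z) and renormalize to sum to 1.

    Assets without a sentiment score are treated as neutral (z = 0).
    """
    tilted = {
        asset: w * math.exp(strength * sentiment_z.get(asset, 0.0))
        for asset, w in base_weights.items()
    }
    total = sum(tilted.values())
    return {asset: w / total for asset, w in tilted.items()}
```

With `strength=0` the base portfolio is returned unchanged, which gives a natural sentiment-unaware benchmark for comparing the two strategies.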
c) Model Assumptions List and explain your key assumptions:
• What you assume about cryptocurrency market behavior and return distributions
• How you expect your structured and unstructured features to relate to future crypto performance
• Assumptions about the predictive power of news sentiment for crypto markets
• Expected relationship between sentiment and price movements
• Assumptions about crypto market efficiency and alpha generation from sentiment analysis
• What transaction costs and portfolio constraints you consider (exchange fees, slippage)
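A transaction-cost assumption can be stated precisely with a small calculation. The sketch below charges a proportional cost on the notional traded at each rebalance; the 10 basis point default is an arbitrary placeholder for fees plus slippage, not an estimate for any exchange.

```python
def turnover(old_weights, new_weights):
    """One-way turnover: half the sum of absolute weight changes."""
    assets = set(old_weights) | set(new_weights)
    return 0.5 * sum(
        abs(new_weights.get(a, 0.0) - old_weights.get(a, 0.0)) for a in assets
    )


def net_return(gross_return, old_weights, new_weights, cost_bps=10.0):
    """Gross period return minus proportional costs on traded notional.

    `cost_bps` is an assumed all-in rate (fees plus slippage) per unit traded.
    """
    traded = 2.0 * turnover(old_weights, new_weights)  # buys plus sells
    return gross_return - traded * cost_bps / 10_000.0
```

Moving from 60/40 to 50/50 between two assets trades 20% of the portfolio in total, which at 10 basis points costs 0.02% of portfolio value.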
d) Technical Constraints and Limitations Identify potential issues with your approach:
• Computing speed requirements for cryptocurrency portfolio optimization with sentiment processing
• Memory and storage capacity needs for multi-crypto data and large news datasets
• Expected optimization accuracy and convergence issues
• Limitations of VADER sentiment analysis for cryptocurrency-specific content
• Challenges in real-time sentiment processing and integration
• Limitations of your chosen optimization methods in volatile crypto markets enhanced by sentiment signals
e) Expected Implementation Results Describe what you expect to see when you implement your models:
• Portfolio performance compared to crypto benchmarks (Bitcoin, total market) and sentiment-unaware strategies
• How different optimization methods enhanced with sentiment might perform in crypto markets
• Expected risk-return characteristics with sentiment integration
• Correlation between sentiment metrics and portfolio performance
• Key metrics you will use to evaluate success (Sharpe ratio, maximum drawdown, sentiment-adjusted returns, etc.)
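Of the evaluation metrics listed above, the Sharpe ratio is the one most sensitive to the choice of return frequency. The sketch below annualizes by the square root of the number of periods per year; the default of 52 assumes weekly returns, and the zero risk-free rate is a simplifying assumption.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=52, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from periodic returns (52 assumes weekly data)."""
    excess = [r - risk_free_per_period for r in returns]
    sd = statistics.stdev(excess)
    if sd == 0:
        return float("nan")
    return statistics.fmean(excess) / sd * math.sqrt(periods_per_year)
```

Because crypto markets trade every day, daily data would usually be annualized with 365 periods rather than the 252 used for equities; whichever convention you choose should be applied to the benchmark as well.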
f) Model Boundaries and Risks Explain the limitations of your approach:
• Market conditions where your sentiment-enhanced optimization might fail
• Time horizons beyond which sentiment-based allocations become unreliable
• External factors your models cannot account for (regulatory changes, exchange hacks, news manipulation)
• Risk management and cryptocurrency portfolio constraints considerations
• Potential biases in news sources and sentiment analysis limitations
g) Station #3 Definition In a few sentences, explain how you would define Station #3 from a Data Management Perspective. Focus on how it uses both historical market patterns and sentiment analysis from news data to generate optimal cryptocurrency portfolio allocations.
Station #4: Model Implementation (25%)
Your task is to implement your sentiment-enhanced cryptocurrency portfolio optimization strategy and create a working application using Google Gemini.
a) Product Design Describe the final product you will build:
• What the cryptocurrency investment application will look like and do, including sentiment dashboard features
• Who the target users are (retail crypto investors, institutional managers, etc.)
• Key features and functionality (crypto portfolio dashboard, sentiment indicators, rebalancing alerts, news integration, etc.)
• How users will interact with both market data and sentiment analysis components
• Real-time sentiment monitoring and alert systems
b) Implementation Steps Explain the steps you took when implementing this solution:
1. Cryptocurrency portfolio optimization model training and validation
2. VADER sentiment analysis implementation and testing
3. Sentiment index construction and validation
4. Integration of sentiment signals with portfolio optimization
5. Crypto backtesting framework development with sentiment factors
6. Application development approach incorporating sentiment dashboard
7. Google Gemini integration process for enhanced user interaction
8. Testing and debugging procedures for both market and sentiment components
c) Implementation Challenges Describe the difficulties you faced during implementation:
• Technical problems with optimization algorithms, crypto APIs, or sentiment processing
• Model performance and convergence issues in volatile crypto markets with sentiment integration
• Challenges in real-time sentiment analysis and news processing
• User interface design challenges for displaying both market and sentiment data
• Multi-cryptocurrency data integration problems with news alignment
• VADER model adaptation and cryptocurrency-specific sentiment challenges
• How you solved each challenge
d) Client Introduction Strategy Recommend steps for introducing this sentiment-enhanced cryptocurrency portfolio optimization tool to clients:
1. Pilot testing with paper crypto portfolios including sentiment backtesting
2. Risk assessment and crypto investor education including sentiment analysis explanation
3. Demonstration of sentiment analysis value-add and historical performance
4. Gradual rollout strategy with capital limits and sentiment signal validation
5. Performance monitoring and reporting approach including sentiment metrics
6. Support and maintenance plan for both market data and news processing systems
e) Customer Journey Describe the complete user experience:
1. How users discover and access your sentiment-enhanced cryptocurrency application
2. Account setup and crypto risk profiling process including sentiment preferences
3. Daily usage workflow combining crypto portfolio monitoring with sentiment analysis
4. How users interpret sentiment signals and make rebalancing decisions
5. Cryptocurrency portfolio performance tracking and reporting with sentiment attribution
6. News monitoring and sentiment alert features
7. Support and help resources for both traditional and sentiment-based features
f) Google Gemini Integration Explain how you used Google Gemini to build your application:
• Specific features Gemini helped create, particularly for sentiment analysis interpretation
• How you integrated Gemini API into your crypto application for enhanced user interaction
• What prompts and interactions you designed for sentiment-based portfolio advice
• How Gemini improves the cryptocurrency investment user experience with natural language sentiment explanations
• Integration of Gemini with news analysis and sentiment interpretation
• Any limitations or challenges with Gemini integration for sentiment-enhanced features
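When documenting the prompts you designed, it helps to show how model outputs are turned into text for the language model. The sketch below only builds the prompt string; the wording of the template is hypothetical, and the actual call to Gemini is omitted because it depends on the client library version and requires an API key.

```python
def build_sentiment_prompt(weights, sentiment, max_assets=5):
    """Compose a plain-text prompt asking an LLM to explain a proposed allocation.

    The template wording is hypothetical. Assets without a sentiment score
    are shown as neutral (0.0).
    """
    lines = [
        "You are assisting a cryptocurrency portfolio manager.",
        "Explain in plain language how the news sentiment scores below "
        "relate to the proposed weights.",
        "Do not give personalized financial advice.",
        "",
        "asset | weight | sentiment (-1 to 1)",
    ]
    # Largest positions first, capped at `max_assets` rows.
    top = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:max_assets]
    for asset, w in top:
        lines.append(f"{asset} | {w:.1%} | {sentiment.get(asset, 0.0):+.2f}")
    return "\n".join(lines)
```

Passing numbers that were computed by your own models, rather than asking the language model to produce them, keeps the allocation reproducible and limits the model's role to explanation.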
g) Station #4 Definition In a few sentences, explain how you would define Station #4 from a Data Management Perspective. Focus on how it brings together all components (structured data, unstructured data, sentiment analysis, and portfolio optimization) into a working cryptocurrency portfolio management system.
Grading Breakdown
Course Grade Distribution:
• FinTech project Part A: 20% (Station #1: 10% + Station #2: 10%)
• FinTech project Part B: 50% (Station #3: 25% + Station #4: 25%)
Evaluation Criteria for Each Station:
• Clear understanding of requirements and objectives for both structured and unstructured data processing
• Appropriate technical approach and methodology for sentiment analysis integration
• Quality of analysis and implementation of both portfolio optimization and sentiment tools
• Professional presentation and documentation
• Innovation and practical applicability of sentiment-enhanced portfolio strategies
Submission Requirements
Part A Deliverables:
• Written report (6-7 pages maximum)
• Python code for cryptocurrency data collection and cleaning (both structured and unstructured)
• Sample of processed multi-crypto data and cleaned news articles
• Text processing pipeline demonstration
• Appendix with plots, tables, and references
Part B Deliverables:
• Written report (10 pages maximum)
• Complete Python code for cryptocurrency portfolio optimization models enhanced with sentiment analysis
• VADER sentiment analysis implementation and sentiment index construction
• Working application with Google Gemini integration including sentiment features
• Crypto backtesting results and performance analysis with sentiment attribution
• Appendix with additional analysis and references
Important Guidelines
• Each part builds on the previous work - start early, especially given the complexity of integrating structured and unstructured data
• Focus on practical implementation of both portfolio optimization and sentiment analysis, not just theory
• Provide clear explanations that cryptocurrency portfolio managers can understand regarding both market and sentiment factors
• Include proper citations for all data sources, methods, and sentiment analysis techniques
• Test your optimization algorithms and sentiment models thoroughly and document limitations
• Make your application user-friendly and professional, with clear presentation of both market and sentiment insights
• Validate your sentiment analysis approach with appropriate backtesting and performance metrics
Appendices
Appendix A: Data Sources
List of recommended APIs and data sources for both cryptocurrency price/volume data and news articles, including CryptoCompare API setup and News API configuration
Appendix B: Technical Requirements
Software requirements, Python libraries (requests, pandas, scipy, cvxpy, vaderSentiment, nltk, textblob), and development environment setup for both structured and unstructured data processing
Appendix C: Sample Code Structure
Recommended code organization and file structure for cryptocurrency portfolio optimization workflow with sentiment analysis integration
Appendix D: Sentiment Analysis Resources
VADER sentiment analysis documentation, cryptocurrency-specific sentiment lexicons, and text processing best practices