Ask Gio, Renee
Two analysts on Reddit for feedback

geometry:
- top=30mm
- left=30mm
- right=30mm
- bottom=30mm

--------------------------------------------------------------------

# Introductory Data Science for Crime Analysis with Python [Intro Book 1]

Front
- My background
- Motivation to write the book
- Why python?
- Content overview
  - how to code, cookbook examples
  - how to set up a project
- More advanced data science components to come

Upfront Work
- downloading python (Anaconda/Miniconda)
- running the REPL/Writing code in Notepad++/VSCode/PyCharm
- running code via `python hello.py`
- working with github
  - git pull
  - git commit
- working with command line
  - cd
  - mkdir

Getting started with python
- objects
  - strings/numbers/lists/dictionaries/booleans
- objects are mutable
- if/then statements, while statement
- ???sets???

Working with strings
- f strings
- string template
- string replacement
- regex
- splitting strings
- ''.join(list)
- upper, lower, strip, swapcase
- zfill

?Working with iterable objects?
- loops
- range
- enumerate
- len()
- don't loop if you don't need to
  - mean, min, max, count
- zip
  - zip takes smallest element
- list comprehensions
- dict keys, values
- while loops
- itertools permutations/combinations
- sets

Libraries
- defining your own functions
- tuple returning
- create your own function in separate file, import it
- edit function, importlib reload
- library renames
- objects have methods, help(function), dir(object)

[Not a separate chapter, in libraries chapter do a call out]
Creating Environments
- create a new conda environment
- install a new library [retenmod]
- run script with that new library

Working with data in python
- pandas/numpy
- SQL queries
- reading in CSV/Excel
- Pandas table notes
  - selecting rows
  - merging data
  - aggregation
  - pivot tables

Working with SQL
- creating your own databases
  - sqlite, duckdb, access
- using pyodbc
- using sqlalchemy
- Rundown of SQL
  - WHERE
  - GROUP BY / HAVING
  - FUNCTIONS (COALESCE, UPPER, whatever, CONCAT)
  - JOINS (LEFT, INNER, CROSS)
  - UNION
  - WITH [CTEs], subqueries
  - IN
  - OVER/RANK/NTILES
  - working with dates
- persisting a query using a VIEW [vs stored procedure]

Making graphs with matplotlib / seaborn
- basic line graph
- bar graph
- scatterplots
- histograms/KDE plots
- small multiples
- templates
- ?Maps?

Using Jupyter to Create Reports
- Jupyter Notebooks basics
- Making nice tables
- nbconvert once completed
- quarto for more advanced

Project Organization
- simple project structure
  - README
  - creating environment
  - functions in src / main script in root
- docstrings for functions
- ?Github?

Automating workflows
- running scripts from command line
- scheduling bat files
- bash output
- error handling
- sending email notifications
- ?running a server?

--------------------------------------------------------------------

# Advanced Project Management for Crime Analysis with Python [Book #2]

## Part 1, Creating your own Project

Starting your project
- root folder
- organization of sub-folders
- readme
- using github

Virtual Environments
- conda environments
- requirements.txt
- python versioning, major/minor

???Creating Python library
- making wheel, setup.py
- saving data files in wheel
- installing -e
- ?documentation//sphinx?
- uploading to pypi

?Advanced Programming Topics?
- object oriented
- memoization local
- currying functions
- vectorized / looping
- databases

Advanced Integrity Tools
- black/flake
- pytest
- github actions

????Advanced Databases/SQL/Docker?????

## Part 2, Example applications

Static Quarterly Report
- Using Open Data
- Compstat report compile to PDF

Interactive Dashboard using Dash
- example dashboard for call response times
- WASM vs local server

Flask Application Chronic Offender Cards
- Machine learning model, chronic offenders
- Pulls up most recent hot list, offender cards

Flask Application API call
- Chronic Offenders, weighted harm
- user has to input information, get back json
- deploy to AWS Lambda

Webscraping
- scrape a webpage for user comments
- APIs
- HTML parsing with BeautifulSoup
- ???Selenium example??

# Advanced Data Science Topics for Crime Analysis with Python [Book #3]

?Poisson?

Mapping
- geocoding
- Weisburd's Law
- creating hot spot maps
- creating clusters
- background maps
- Folium
- SPPT

Social Network Analysis
- Graphs
- centrality
- prioritize focused deterrence

Linear Programming for Resource Allocation
- balance patrol areas
- fair hot spots
- assigning detective cases
- set cover assign social network

Regression Modelling
- Statsmodels
- formulas
- example linear regression
- example Negative Binomial
- example quantile regression
- ???non-linear modelling????
- ???bayesian modelling????

Statistical Modelling for Binary End Points
- Chronic Offender lists
- Random Forests / Logistic Regression
- predictive model
- Survival analysis

Statistical Modelling for Spatial Predictions
- Machine learning for long term predictions
- Deep Learning for Short term predictions

Time Series Analysis
- Seasonal decomposition
- ARIMA models
- Hierarchical binomial models
- Poisson models
- cusum charts

NLP applications
- Named Entity Resolution
- Vector embedding/similarity search [Modus Operandi]
- Supervised learning examples [bad text example]

Program Analysis
- WDD
- Synthetic Control / Matching estimators
- Binary outcomes/Power analysis
- Spatial Diffusion

---------------------------------------------------

# Machine Learning Applications in Crime Analysis with Python [Book #4]

Regression and the problem with overfitting
- need out of sample validation
- regularization
- non-linear effects via splines

Random Forests and Boosting
- example decision tree
- bad vs linear regression
- example multiple trees + bootstrapping
- step up to boosting

Deep Learning
- example hidden layers
- example LSTM
- example attention network

Encoding High Cardinality Features
- ordinal encoding
- weight of evidence
- dirty_cat, embedding, TF/IDF

Natural Language Processing
- simple transformers/Hugging Face
- example text classification
- compare CatBoost text features vs simple transformers vs combo models

Image Processing
- pre-trained models????
- google vision API????
- training own model to identify guns/something else?????