RITCH - Parser for the ITCH Protocol
Efficiently parses binary ITCH files (Version 5.0), which contain detailed financial transactions as distributed by NASDAQ, into a data.table, and filters and writes such files. Includes functions to interact with NASDAQ data services at <https://emi.nasdaq.com/ITCH/> and <https://emi.nasdaq.com/ITCH/Stock_Locate_Codes/>.
Last updated
cpp itch r datatable cpp
4.56 score 20 stars 18 scripts 487 downloads
dataverifyr - A Lightweight, Flexible, and Fast Data Validation Package that Can Handle All Sizes of Data
Allows you to define rules which can be used to verify a given dataset. The package acts as a thin wrapper around more powerful data packages such as 'dplyr', 'data.table', 'arrow', and 'DBI' ('SQL'), which do the heavy lifting.
Last updated
verification
4.46 score 29 stars 6 scripts 594 downloads
rtiktoken - A Byte-Pair-Encoding (BPE) Tokenizer for OpenAI's Large Language Models
A thin wrapper around the tiktoken-rs crate that lets you encode text into Byte-Pair-Encoding (BPE) tokens and decode tokens back to text. This is useful for understanding how Large Language Models (LLMs) perceive text.
Last updated
bpe openai rust tokenization cargo
4.15 score 14 stars 7 scripts 156 downloads
rbm25 - A Light Wrapper Around the 'BM25' 'Rust' Crate for Okapi BM25 Text Search
BM25 is a ranking function used by search engines to rank matching documents according to their relevance to a user's search query. This package provides a light wrapper around the 'BM25' 'Rust' crate for Okapi BM25 text search. For more information, see Robertson et al. (1994) <https://trec.nist.gov/pubs/trec3/t3_proceedings.html>.
Last updated
bm25 rust similarity-search cargo
3.78 score 12 stars 3 scripts 174 downloads
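To illustrate the ranking function mentioned in the rbm25 description, here is a minimal Python sketch of Okapi BM25 scoring. It is not the rbm25 API and not the Rust crate's implementation: the function name `bm25_scores` is hypothetical, the defaults `k1=1.2` and `b=0.75` are merely common choices, and the smoothed IDF form (with `+ 1` inside the logarithm) is one of several variants in use.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25).

    Hypothetical helper for illustration only; assumes `docs` is non-empty.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)  # term frequencies within this document
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF variant; the "+ 1.0" keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat".split(),
    "dogs chase cats".split(),
    "rust crates for text search".split(),
]
scores = bm25_scores("text search".split(), docs)
# Index of the highest-scoring document.
best = max(range(len(docs)), key=scores.__getitem__)
```

Documents that share no term with the query score zero, so only the third document is ranked as a match here. Whitespace tokenization is a simplification; real search setups usually lowercase, stem, and remove stop words first.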