- 5-Level Paging and 5-Level EPT
- 37 Million Compilations: Investigating Novice Programming Mistakes in Large-Scale Student Data
- Unwinding the Stack: Exploring how C++ Exceptions work on Windows
- A Block-sorting Lossless Data Compression Algorithm
- A Brief Introduction to Neural Networks
- A Brief Introduction to the Standard Annotation Language (SAL)
- A Brief Tutorial on Database Queries, Data Mining, and OLAP
- A Case Study in Optimizing HTM-Enabled Dynamic Data Structures: Patricia Tries
- A Catalogue of Optimizing Transformations
- A Comparison of Programming Languages in Economics
- A Comparison of Software and Hardware Techniques for x86 Virtualization
- A comparison of SPDY and HTTP performance
- A Compilation Target for Probabilistic Programming Languages
- A comprehensive study of Convergent and Commutative Replicated Data Types
- A Comprehensive Study of Main-Memory Partitioning and its Application to Large-Scale Comparison- and Radix-Sort
- A Compressed Suffix Tree Based Implementation With Low Peak Memory Usage
- A Course in Machine Learning
- A Crash Course in x86 Assembly for Reverse Engineers
- A Detailed Analysis of the Component Object Model
- A Dynamic Perfect Hash Function Defined by an Extended Hash Indicator Table
- A Family of Perfect Hashing Methods
- A Fast x86 Implementation of Select
- A Fast, Minimal Memory, Consistent Hash Algorithm
- A Faster Cutting Plane Method and its Implications for Combinatorial and Convex Optimization
- A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World
- A few experiments with the Cache Allocation Technology
- A File Comparison Program
- A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications
- A First Encounter with Machine Learning
- A Forensic Analysis of Navy Carrier Strike Group Eleven's Encounter with an Anomalous Aerial Vehicle
- A Framework for Building Extensible C++ Class Libraries
- A History of Modern 64-bit Computing
- A little journey inside Windows memory
- A locality-sensitive hash for real vectors
- A Lock-Free Wait-Free Hash Table
- A Look at Intel's Dataplane Development Kit
- A Malloc Tutorial
- A Mathematical Theory of Communication
- A Mathematician's Lament
- A Method for the Construction of Minimum-Redundancy Codes
- A Nanopass Framework for Compiler Education
- A New Basis for Shifters in General-Purpose Processors for Existing and Advanced Bit Manipulations
- A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake
- A NUMA API for Linux
- A Parallel Page Cache: IOPS and Caching for Multicore Systems
- A PlusCal User's Manual: C-Syntax Version 1.8
- A Practical Guide to Support Vector Classification
- A Practical Minimal Perfect Hashing Method
- A Primer on Memory Consistency and Cache Coherence
- A Probabilistic Theory of Deep Learning
- A Proposal for Hardware-Assisted Arithmetic Overflow Detection for Array and Bitfield Operations
- A quick guide to LaTeX
- A Relational Model of Data for Large Shared Data Banks
- A Reliable Randomized Algorithm for the Closest-Pair Problem
- A Scalable and Explicit Event Delivery Mechanism for UNIX
- A Scalable Concurrent malloc(3) Implementation for FreeBSD
- A Scalable Lockfree Stack Algorithm
- A Sense of Self for Unix Processes
- A Seven-Dimensional Analysis of Hashing Methods and its Implications on Query Processing
- A study of code abstraction
- A Study of "Wheat" and "Chaff" in Source Code
- A survey of rollback-recovery protocols in message-passing systems
- Thread Scheduling
- A tool for the linux binaries
- A Truly Concurrent Semantics for the K Framework Based on Graph Transformations
- A Tunable Compression Framework for Bitmap Indices
- A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning
- A versatile data structure for edge-oriented graph algorithms
- A very fast substring search algorithm
- A Wavelet Tree Based FM-Index for Biological Sequences in SeqAn
- A Way Forward in Parallelising Dynamic Languages
- Abstract Algebra: Theory and Applications
- Abstract Rendering: Out-of-core Rendering for Information Visualization
- Abusing Mach on Mac OS X
- Accelerating Network Receive Processing
- Adaptive Insertion Policies for High Performance Caching
- Adaptive Ray Packet Reordering
- Adaptive String Dictionary Compression in In-Memory Column-Store Database Systems
- Adding lock elision to Linux
- AddressSanitizer: A Fast Address Sanity Checker
- Advanced Bloom Filter Based Algorithms for Efficient Approximate Data De-Duplication in Streams
- Advanced Data Structures
- Advanced Topics in CUDA
- Advances in Cloud-Scale Machine Learning for Cyber-Defense
- Advances in Memory Management for Windows
- Improving Network Connection Locality on Multicore Systems
- C++ vector class library
- VCL C++ vector class library manual
- Calling conventions for different C++ compilers and operating systems
- Instruction tables
- The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers
- Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms
- Optimizing subroutines in assembly language: An optimization guide for x86 platforms
- Algebra, Topology, Differential Calculus, and Optimization Theory For Computer Science and Machine Learning
- Algorithms for Random 3-SAT
- Algorithms for Routing Lookups and Packet Classification
- Allocation Removal by Partial Evaluation in a Tracing JIT
- Almost random graphs with simple hash functions
- Alpha AXP Architecture
- Alternating Coding and its Decoder Architectures for Unary-Prefixed Codes
- AMD64 Architecture Programmer's Manual, Volume 4: 128-Bit and 256-Bit Media Instructions
- AMD64 Architecture Programmer's Manual, Volume 1: Application Programming
- AMD64 Architecture Programmer's Manual, Volume 2: System Programming
- AMD64 Architecture Programmer's Manual, Volume 3: General-Purpose and System Instructions
- BIOS and Kernel Developer's Guide (BKDG) for AMD Family 15h Models 70h-7Fh Processors
- CPUID Specification
- Graphics Core Next Architecture, Generation 3
- Open-Source Register Reference For AMD Family 17h Processors Models 00h-2Fh
- Preliminary Processor Programming Reference (PPR) for AMD Family 17h Models 00h-0Fh Processors
- Software Optimization Guide for AMD Family 15h Processors
- The Development of Transition-State Theory
- Software Optimization Guide for AMD64 Processors
- An Approach for Minimal Perfect Hash Functions for Very Large Databases
- An elegant algorithm for the construction of suffix arrays
- An Evaluation of Network Stack Parallelization Strategies in Modern Operating Systems
- An experimental exploration of Marsaglia's xorshift generators, scrambled
- An In-Depth Analysis of Disassembly on Full-Scale x86/x64 Binaries
- An Informal Analysis of Perfect Hash Function Search
- An Introduction to Computational Networks and the Computational Network Toolkit
- An Introduction to Statistical Learning with Applications in R
- An NUMA API for Linux
- An optimal algorithm for generating minimal perfect hash functions
- An Overview of Kernel Lock Improvements
- Analysing the Performance of GPU Hash Tables for State Space Exploration
- Analysis of B-tree data structure and its usage in computer forensics
- Analysis of GS protections in Microsoft Windows Vista
- Analyzing Contextual Bias of Program Execution on Modern CPUs
- Analyzing General-Purpose Computing Performance on GPU
- Analyzing GPU Pipeline Latency
- Analyzing Runtime and Size Complexity of Integer Programs
- Analyzing your game performance using Event Tracing for Windows
- Anatomy of High-Performance Matrix Multiplication
- Answering reachability queries on large directed graphs
- The "Ultimate" Anti-Debugging Reference
- Applications of Finite Automata Representing Large Vocabularies
- Applications of finite geometry in coding theory and cryptography
- Applying The Proactor Pattern To High-Performance Web Servers
- Approximate Hypergraph Partitioning and Applications
- Architectural Support for SWAR Text Processing with Parallel Bit Streams: The Inductive Doubling Principle
- Architecture of a Database System
- ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging
- ARM and Thumb-2 Instruction Set Quick Reference Card
- Array Layouts for Comparison-Based Searching
- Array programming with NumPy
- The Art of Assembly Language
- Asim: A Performance Model Framework
- ASLR on the Line: Practical Cache Attacks on the MMU
- Aspects Related to Data Access and Transfer in CUDA
- Assembly Language for Beginners
- Assessing the Relationship between Software Assertions and Code Quality: An Empirical Investigation
- Assessment of Windows Vista Kernel-Mode Security
- Asynchronous Teams: Cooperation Schemes for Autonomous Agents
- ATOM: A System for Building Customized Program Analysis Tools
- Attacking the Windows Kernel
- Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures
- Automatic self-allocating threads (ASAT) on an SGI Challenge
- Automatically Proving Termination and Memory Safety for Programs with Pointer Arithmetic
- Hyperparameter Optimization
- Avoiding AVX-SSE Transition Penalties
- Background, Motivation, and a Retrospective View of the BLAS
- Backward search fm-index (full-text index in minute space)
- Balanced Families of Perfect Hash Functions and Their Applications
- Bash Redirections Cheat Sheet
- Basic Linear Algebra Subprograms for Fortran Usage
- Basics of Compiler Design
- Battle of SKM and IUM - How Windows 10 Rewrites OS Architecture
- Hardware Breakpoint (or watchpoint) usage in Linux Kernel
- Bayesian Reasoning and Machine Learning
- Benchmarking a B-tree compression method
- Benefits of I/O Acceleration Technology (I/OAT) in Clusters
- Best Practices for Gathering Optimizer Statistics with Oracle Database 12c
- Best Practices for Vectorization
- Better bitmap performance with Roaring bitmaps
- Better Performance at Lower Occupancy
- Better with fewer bits: Improving the performance of cardinality estimation of large data streams
- Beyond Block I/O: Rethinking Traditional Storage Primitives
- BGP in 2013 (and a bit of 2014)
- Big Data: New Tricks for Econometrics
- An Inside Look at Google BigQuery
- Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1
- Binary Coding
- Binary Combinatorial Coding
- Binary search tree with SIMD bandwidth optimization using SSE
- BIOS and Kernel Developer's Guide for AMD Athlon 64 and AMD Opteron Processors
- Bit Operations
- Bitcoin: A Peer-to-Peer Electronic Cash System
- Bitmap Graphics SIGGRAPH'84 Course Notes
- A New System of Alternating Current Motors and Transformers
- Bitmap Index Design and Evaluation
- Bitmap Index Design Choices and Their Performance Implications
- Bitmap Indexing and related indexing techniques
- Bitmap Indices for Data Warehouses
- BitPath – Label Order Constrained Reachability Queries over Large Graphs
- On the Influence of Carbonic Acid in the Air upon the Temperature of the Ground
- Blade: A Data Center Garbage Collector
- BLAKE2: simpler, smaller, fast as MD5
- Blogel: A Block-Centric Framework for Distributed Computation on Real-World Graphs
- Boosting Vector Calculus with the Graphical Notation
- Bounds Checking on GPU
- BPF – in-kernel virtual machine
- Branch and Data Herding: Reducing Control and Memory Divergence for Error-tolerant GPU Applications
- Branch Prediction and the Performance of Interpreters - Don't Trust Folklore
- Branch Prediction with Neural Networks: Hidden layers and Recurrent Connections
- Brief Calculus
- Bringing SIMD-128 to JavaScript
- Broadword Implementation of Parenthesis Queries
- Broadword Implementation of Rank/Select Queries
- Brook for GPUs: Stream Computing on Graphics Hardware
- B-trees, Shadowing, and Clones
- Bugs as Deviant Behavior: A General Approach to Inferring Errors in Systems Code
- Build Systems à la Carte
- Building a Bw-Tree Takes More Than Just BuzzWords
- Building R Packages: An Introduction
- Bumper Sticker Computer Science: programming pearls
- Burrows-Wheeler Transform and FM Index
- Bypass Control Flow Guard Comprehensively
- C Reference Cheat Sheet
- Introduction to DevOps on AWS
- Working Draft, Standard for Programming Language C++
- Language Features of C++17
- C++20 Features
- Cache and I/O Efficient Functional Algorithms
- CAB: Cache Aware Bi-tier Task-stealing in Multi-socket Multi-core Architecture
- Cache Organization and Memory Management of the Intel Nehalem Computer Architecture
- Cache-, Hash- and Space-Efficient Bloom Filters
- Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency
- Cache-Oblivious Algorithms and Data Structures
- Cache-Oblivious Peeling of Random Hypergraphs
- Cache-Oblivious Streaming B-trees
- CAF - The C++ Actor Framework for Scalable and Resource-efficient Applications
- Calculus Made Easy
- Calculus Refresher, version 2008.4
- Canopy: An End-to-End Performance Tracing And Analysis System
- Can't Get To Performing Without Storming
- Captain Hook: Pirating AVs to Bypass Exploit Mitigations
- CASEVision/ClearCase Administration Guide
- Checking system rules using system-specific, programmer-written compiler extensions
- Chihuahua: A Concurrent, Moving, Garbage Collector using Transactional Memory
- Chinese remainder theorem and its applications
- Chart Suggestions — A Thought-Starter
- Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
- CityHash: Fast Hash Functions for Strings
- CK-12 Probability and Statistics - Advanced
- Rational ClearCase Administrator's Guide
- Cluster based Mixed Coding Schemes for Inverted File Index Compression
- Codes for the positive integers
- Cognitive Biases Potentially Affecting Judgment of Global Risks
- Programming And Optimization For Intel Architecture: One-Day Workshop
- Optimization Techniques for the Intel MIC Architecture. Part 1 of 3: Multi-Threading and Parallel Reduction
- Optimization Techniques for the Intel MIC Architecture. Part 2 of 3: Strip-Mining for Vectorization
- Optimization Techniques for the Intel MIC Architecture. Part 3 of 3: False Sharing and Padding
- Comdb2: Bloomberg's Highly Available Relational Database System
- Communication Efficient Distributed Machine Learning with the Parameter Server
- Comparative Performance of Memory Reclamation Strategies for Lock-free and Concurrently-readable Data Structures
- Competitive Programmer's Handbook
- Compiler Confidential
- Compiler Construction
- Compiler Construction - The Art of Niklaus Wirth
- Compiler Design: Theory, Tools, and Examples
- Compiler Design In C
- Compiler Internals: Exceptions and RTTI
- Compiling Python Modules to Native Parallel Modules Using Pythran and OpenMP Annotations
- Component Object Model: An Overview and Practical Implementation
- Compressed Bloom Filters
- Inverted Indexes: Compressed Inverted Indexes
- Compressed Perfect Embedded Skip Lists for Quick Inverted-Index Lookups
- compression
- Computer Systems Research: Past and Future
- Concurrent Hash Tables: Fast and General(?)!
- Concurrent Programming for Scalable Web Architectures
- Concurrent Reference Counting and Resource Management in Constant Time
- Conflict-Free Vectorization of Associative Irregular Applications with Recent SIMD Architectural Advances
- Consistently faster and smaller compressed bitmaps with Roaring
- Constraint Propagation Algorithms for Temporal Reasoning: A Revised Report
- Convex Optimization
- Cooperative Kernels: GPU Multitasking for Blocking Algorithms
- Coq: The world's best macro assembler?
- Cores of Random r-Partite Hypergraphs
- CORFU: A Shared Log Design for Flash Clusters
- COZ: Finding Code that Counts with Causal Profiling
- Cache Memory
- Creating R Packages: A Tutorial
- Critique Of Microkernel Architectures
- Cuckoo Filter: Practically Better Than Bloom
- Cuckoo++ Hash Tables: High-Performance Hash Tables for Networking Applications
- CUDA C Programming Quick Reference
- CUDA Asynchronous Memory Usage and Execution
- CUDA C/C++ Streams and Concurrency
- CUDA C Programming Guide
- CUDA Debugging With Command Line Tools
- CUDA Unified Memory
- CUDA Optimizations
- CUDA Streams: Best Practices And Common Pitfalls
- CUDA Thread Basics
- CUDA Thread Indexing Cheatsheet
- Graphics Processing Units (GPUs): Architecture and Programming
- Unified memory - GPGPU 2015: High Performance Computing with CUDA
- CSE 591: GPU Programming Using CUDA in Practice
- CUDAsmith: A Fuzzer for CUDA Compilers
- Curves and Surfaces: Lecture Notes for Geometry 1
- Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
- Lecture 1: Shannon's Theorem
- Lecture 2: Morse Code to Huffman Coding
- Data Compression Techniques - Lecture 3: Integer Codes I
- Data Compression Techniques - Lecture 4: Integer Codes II
- Lecture 5: Adaptive Prefix-Free Coding
- Lecture 6: Arithmetic Coding
- Data Compression Techniques - Lecture 7: Dictionary Compression
- Data Structures and Algorithms: Annotated Reference with Examples
- Data Structures for Text Sequences
- Data Transfer Matters for GPU Computing
- Database Fundamentals
- Database System Implementation
- Overview of Amazon Web Services
- Collaborative Filtering: I like what you like
- Collaborative Filtering: Implicit ratings and item based filtering
- Classification based on item attributes
- Evaluating algorithms and kNN
- Naive Bayes
- Classifying unstructured text
- Data-Parallel Hashing Techniques for GPU Architectures
- Debugging Programs that use Atomic Blocks and Transactional Memory
- Getting Started with WinDbg (User-Mode)
- Debunking the 100X GPU vs. CPU Myth: An Evaluation of Throughput Computing on CPU and GPU
- DEC: The mistakes that led to its downfall
- Deep Learning Tutorial
- DeepState: Symbolic Unit Testing for C and C++
- Demystifying DAS, SAN, NAS, NAS Gateways, Fibre Channel, and iSCSI
- Demystifying GPU Microarchitecture through Microbenchmarking
- Deny Capabilities for Safe, Fast Actors
- Depth-first search and linear graph algorithms
- Derivability, Redundancy and Consistency of Relations Stored in Large Data Banks
- Designing COM Interfaces
- Deterministic Dynamic Deadlock Detection and Recovery
- Detours: Binary Interception of Win32 Functions
- Developing and Porting C and C++ Applications on AIX
- DIGITAL FX!32 Running 32-Bit x86 Applications on Alpha NT
- Dijkstra's in Disguise
- DI-MMAP: A High Performance Memory-Map Runtime for Data-Intensive Applications
- Direct Cache Access for High Bandwidth Network I/O
- Disk Based Hash Tables and Quantified Numbers
- Disruptor: High performance alternative to bounded queues for exchanging data between concurrent threads
- Dissecting the NVidia Turing T4 GPU via Microbenchmarking
- Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking
- Distributed and parallel time series feature extraction for industrial big data applications
- Distributed Component Object Model (DCOM) Remote Protocol
- Dodd-Frank Act Stress Test 2014: Supervisory Stress Test Methodology and Results
- Down for the Count? Getting Reference Counting Back in the Ring
- Draw me a Local Kernel Debugger
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting
- DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X, and FreeBSD
- Dueling UNIXes and the UNIX Wars
- Dynamic Storage Allocation: A Survey and Critical Review
- Effective Computation of Biased Quantiles over Data Streams
- Efficient Algorithms for Large-Scale Image Analysis
- Efficient Computation of Binomial Coefficients Using Splay Trees
- Efficient Estimation of Mutual Information for Strongly Dependent Variables
- Efficient Estimation of Word Representations in Vector Space
- Efficient Exploitation of Parallelism on Pentium III and Pentium 4 Processor-Based Systems
- Efficient Hash Probes on Modern Processors
- Efficient Hashing with Lookups in two Memory Accesses
- Efficient implementation of lazy suffix trees
- Efficient Implementation of Reductions on GPU Architectures
- Black Holes and Entropy
- Efficient Implementation of Sorting on Multi-Core SIMD CPU Architecture
- Efficient Lightweight Compression Alongside Fast Scans
- Efficient Lossless Compression of Trees and Graphs
- Efficient Parallel Graph Exploration on Multi-Core CPU and GPU
- Efficient string matching: An aid to bibliographic search
- Efficient Virtual Memory for Big Memory Servers
- Efficiently Compiling Efficient Query Plans for Modern Hardware
- Egocentrism Over E-Mail: Can We Communicate as Well as We Think?
- E = I + T: The internal extent formula for compacted tries
- Elementary Calculus: An Infinitesimal Approach
- Executable and Linkable Format (ELF)
- ELF Handling for Thread-Local Storage
- Eliminating Global Interpreter Locks in Ruby through Hardware Transactional Memory
- Empirical Study of the Anatomy of Modern SAT Solvers
- Encyclopedia of Controller Fundamentals and Features - Firmware
- Engineering Better Software at Microsoft
- Enhancing Server Availability and Security Through Failure-Oblivious Computing
- A Machine-learning Method To Explore The UEFI Landscape
- Establishing a Base of Trust with Performance Counters for Enterprise Workloads
- Estimating Flight Characteristics of Anomalous Unidentified Aerial Vehicles
- Hooking Nirvana: Stealthy Instrumentation Techniques
- Evaluation of Contemporary Graph Databases for Efficient Persistence of Large-Scale Models
- Evaluation of Parallel Design Patterns for Message Processing Systems on Embedded Multicore Systems
- Evaluation of Rolling Sphere Method Using Leader Potential Concept: A Case Study
- EventSource User's Guide
- EventSource Activity Support
- Everything we know about CRC but afraid to forget
- Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask
- Exact minimum degree thresholds for perfect matchings in uniform hypergraphs
- Experiences in the Land of Virtual Abstractions
- Experiences Porting Real Time Signal Processing Pipeline CUDA Kernels to Kepler and Windows 8
- Expert programmers have fine-tuned cortical representations of source code
- Explaining AdaBoost
- EXPLODE: a Lightweight, General System for Finding Serious Storage System Errors
- Exploiting Coarse-Grain Speculative Parallelism
- Exploiting deferred destruction: An analysis of read-copy-update techniques in operating system kernels
- Exploiting SIMD for Complex Numerical Predicates
- Exploring Control Flow Guard in Windows 10
- Exploring PL/SQL New Features and Best Practices for Better Performance
- Exponential Golomb and Rice Error Correction Codes for Generalized Near-Capacity Joint Source and Channel Coding
- Extending Oracle E-Business Suite Release 12.1 and above using Oracle Application Express
- Extending Python for High-Performance Data-Parallel Programming
- External Perfect Hashing for Very Large Key Sets
- The Pilot's Operating Handbook
- Failure-Atomic msync(): A Simple and Efficient Mechanism for Preserving the Integrity of Durable Data
- Fallout: Reading Kernel Writes From User Space
- FAST: Fast Architecture Sensitive Tree Search on Modern CPUs and GPUs
- Fast and scalable minimal perfect hashing for massive key sets
- Fast And Space Efficient Trie Searches
- Fast Bit Compression and Expansion with Parallel Extract and Parallel Deposit Instructions
- Fast Bit Gather, Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors
- Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction
- Fast Databases with Fast Durability and Recovery Through Multicore Parallelism
- Fast Deterministic Selection
- Fast Exact Multiplication by the Hessian
- Fast keyed hash/pseudo-random function using SIMD multiply and permute
- Fast Multiple String Matching Using Streaming SIMD Extensions Technology
- Fast Packed String Matching for Short Patterns
- Fast Parallel GPU-Sorting Using a Hybrid Algorithm
- Fast Parallel Suffix Array on the GPU
- Fast Prefix Search in Little Space, with Applications
- Fast Quicksort Implementation Using AVX Instructions
- Fast Scalable Construction of (Minimal Perfect Hash) Functions
- Fast Search in Hamming Space with Multi-Index Hashing
- Fast Sort on CPUs, GPUs and Intel MIC Architectures
- Fast Sorted-Set Intersection using SIMD Instructions
- Fast Sorting Algorithms using AVX-512 on Intel Knights Landing
- Fast Splittable Pseudorandom Number Generators
- Fast String Correction with Levenshtein-Automata
- FastBDT: A speed-optimized and cache-friendly implementation of stochastic gradient-boosted decision trees for multivariate classification
- Faster 64-bit universal hashing using carry-less multiplications
- Faster Base64 Encoding and Decoding using AVX2 Instructions
- Faster Population Counts using AVX2 Instructions
- Featherweight Threads for Communication
- FERRARI: Flexible and Efficient Reachability Range Assignment for Graph Indexing
- Fibers under the magnifying glass
- Fibre Channel Fundamentals
- Filter Manager
- The Salomon Smith Barney Introductory Guide to Equity Options
- Finding Frequent Items in Data Streams
- Finding Minimal Perfect Hash Functions
- Finding Similar Items
- Finding small balanced separators
- FLASHRELATE: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples
- FLUSH+RELOAD: a High Resolution, Low Noise, L3 Cache Side-Channel Attack
- Folding and Unfolding
- Foreign Library Interface
- Foundations of Data Science
- Foundations of Databases
- FPGA Acceleration by Dynamically-Loaded Hardware Libraries
- Fractal Prefetching B+-Trees: Optimizing Both Cache and Disk Performance
- Framework for Instruction-level Tracing and Analysis of Program Executions
- Free Launch: Optimizing GPU Dynamic Kernel Launches through Thread Reuse
- From Numerical Cosmology to Efficient Bit Abstractions for the Standard Library
- Fully Concurrent Garbage Collection of Actors on Many-Core Machines
- Fundamentals of Calculus
- Fundamentals of Deep Learning of Representations
- Fundamentals of Learning
- Further scramblings of Marsaglia's xorshift generators
- Futexes Are Tricky
- General Analysis of Maxima and Minima in Constrained Optimization Problems
- General Incremental Sliding Window Aggregation
- Generalized Golomb Codes and Adaptive Coding of Wavelet-Transformed Image Subbands
- Generalized Histogram Algorithms for CUDA GPUs
- Generating Sequences With Recurrent Neural Networks
- Generating Text with Recurrent Neural Networks
- Getting Physical: Extreme abuse of Intel based Paging Systems
- Getting Started with CUDA
- Getting Started with Software Tracing in Windows Drivers
- Git from the bottom up
- Git Magic
- Go 1.5 concurrent garbage collector pacing
- Goals Gone Wild: The Systematic Side Effects of Over-Prescribing Goal Setting
- Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
- GPERF: A Perfect Hash Function Generator
- Lecture 3: control flow and synchronisation
- GPU programming basics
- GPU-ArraySort: A parallel, in-place algorithm for sorting large number of arrays
- Does the Inertia of a Body Depend upon its Energy-Content?
- Graph theoretic obstacles to perfect hashing
- Graph Theory
- GraphBLAS Mathematics
- Graphs, Hypergraphs and Hashing
- GRIM: Leveraging GPUs for Kernel Integrity Monitoring
- Particle Creation by Black Holes
- Guide to Automatic Vectorization with Intel AVX-512 Instructions in Knights Landing Processors
- Gunrock: A High-Performance Graph Processing Library on the GPU
- H2O - the optimized HTTP server
- Hardware Acceleration for Memory to Memory Copies
- IMP: Indirect Memory Prefetcher
- Hardware is the new software
- Hardware Transactional Memory on Haswell
- Hardware-Aware Optimization: Using Intel Streaming SIMD Extensions
- HARE: Hardware Accelerator for Regular Expressions
- Harnessing Intel Processor Trace on Windows for Vulnerability Discovery
- Hash and Displace: Efficient Evaluation of Minimal Perfect Hash Functions
- Hash Tables
- Hash, displace, and compress
- Hashcash - A Denial of Service Counter-Measure
- HASHI: An Application-Specific Instruction Set Extension for Hashing
- Haskell vs. F# vs. Scala: A High-level Language Features and Parallelism Support Comparison
- Haswell block diagram
- HAT-trie: A Cache-conscious Trie-based Data Structure for Strings
- Taming parallel I/O complexity with auto-tuning
- Heapy: A Memory Profiler and Debugger for Python
- HELIX-RC: An Architecture-Compiler Co-Design for Automatic Parallelization of Irregular Programs
- Heracles: Improving Resource Efficiency at Scale
- HexRaysCodeXplorer: object oriented RE for fun and profit
- Hidden Markov Model
- High Performance Histograms on SIMT and SIMD Architectures
- High Performance I/O with NUMA Systems in Linux
- High Speed Hashing for Integers and Strings
- High Throughput Heavy Hitter Aggregation for Modern SIMD Processors
- High-Performance Concurrency Control Mechanisms for Main-Memory Databases
- Histograms: Privatized for Fast, Level Performance
- Hoard: A Scalable Memory Allocator for Multithreaded Applications
- How does a GPU shader core work?
- How fast can we make interpreted Python?
- How Microsoft Builds Software
- How NOT to Measure Latency
- How the VAX Lost Its POLY (and EMOD and ACB_floating too)
- How to Benchmark Code Execution Times on Intel IA-32 and IA-64 Instruction Set Architectures
- How To Code In HTML5 And CSS3
- How To Overcome the GIL Limitations (While Staying In Python Ecosphere)
- How to Partition a Billion-Node Graph
- How to Read a Paper
- How To Test 10 Gigabit Ethernet Performance
- How To Use Event Tracing For Windows For Performance Analysis
- How to Write Fast Code
- How to Write Fast Numerical Code
- How to Write Shared Libraries
- How TokuDB Fractal Tree Indexes Work
- HTTP as the Narrow Waist of the Future Internet
- Multimedia Communications
- HyperANF: Approximating the Neighbourhood Function of Very Large Graphs on a Budget
- Hyperedge Replacement Graph Grammars
- CS 6824: Hypergraph Algorithms and Applications
- Hypergraph-Based Anomaly Detection in Very Large Networks
- Hypergraphs: Algorithms, Implementations, and Applications
- HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm
- How to Program in C++ With 100 Examples
- I Got 99 Problems But a Kernel Pointer Ain't One
- iBFS: Concurrent Breadth-First Search on GPUs
- IBM DB2 for i indexing methods and strategies
- Basic SAN Configuration Setup Guide
- IDA Plug-in Writing in C/C++
- Ideal Hash Trees
- NVM Express and the PCI Express SSD Revolution
- IEEE 802.3ad Link Aggregation (LAG)
- Binary Encoding and Quantization
- Implementing Algebraic Effects in C "Monads for Free in C"
- Implementing Sorting in Database Systems
- Improved Bounds For Covering Complete Uniform Hypergraphs
- Improved Fast Similarity Search in Dictionaries
- Technical Report: Improvement of Fitch function for Maximum Parsimony in Phylogenetic Reconstruction with Intel AVX2 assembler instructions
- Improving Automated Analysis of Windows x64 Binaries
- Improving compiler optimizations using machine learning
- Improving Python's Memory Allocator
- Improving Real-Time Performance with CUDA Persistent Threads (CuPer) on the Jetson TX2
- Improving the speed of neural networks on CPUs
- Incremental Construction of Minimal Acyclic Finite State Automata and Transducers
- Index Search Algorithms for Databases and Modern CPUs
- Induced subgraphs of hypercubes and a proof of the Sensitivity Conjecture
- Infinite-Alphabet Prefix Codes Optimal for β-Exponential Penalties
- Information Retrieval
- Information Theory for Intelligent People
- Initial End-to-End Performance Evaluation of 10-Gigabit Ethernet
- InK-Compact: In-Kernel Stream Compaction and Its Application to Multi-Kernel Data Visualization on General-Purpose GPUs
- Inline Function Expansion for Compiling C Programs
- In-Memory Columnar Store for PostgreSQL
- Jenkins CheatSheet
- Inside the deal that made Bill Gates $350,000,000
- Inside the Python GIL
- Instant Loading for Main Memory Databases
- Integer encoding
- __vectorcall and __regcall Demystified
- A Novel Hashing Method suitable for Lookup Functions
- Intel Architecture: Instruction Set Extensions Programming Reference
- Avoiding AVX-SSE Transition Penalties
- Improving Real-Time Performance by Utilizing Cache Allocation Technology
- A literature survey on Machine Learning Algorithms
- Mitigations for Jump Conditional Code Erratum
- Performance Monitoring Unit Sharing Guide
- Intel AVX-512 Architecture Insights
- 5th Generation Intel Core Processor Family Specification Update
- Intel 64 and IA-32 Architectures Optimization Reference Manual
- Intel 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C and 3D
- Intel Advanced Encryption Standard (AES) New Instructions Set
- Kubernetes for Full-Stack Developers
- Intel Architecture Instruction Set Extensions and Future Features Programming Reference
- Intel Architecture Code Analyzer
- Intel Architecture Instruction Set Extensions Programming Reference
- Intel Carry-Less Multiplication Instruction and its Usage for Computing the GCM Mode
- Performance Analysis
- Chemical Kinetics and the Origins of Physical Chemistry
- Intel Ethernet Controller 82571EB/82572EI/82571GB/82571GI Specification Update
- Intel I/O Acceleration Technology (Intel IOAT) Overview
- Intel Multimedia Instructions (MMX, SSE, SSE2, SSE3, SSSE3 and SSE4)
- PCI Express* Ethernet Networking
- The Origin and Status of the Arrhenius Equation
- Introduction to C#
- On the Theory of the Energy Distribution Law of the Normal Spectrum
- Intel SIMD architecture
- The Story Of Intel MMX Technology
- Intel Xeon Processor E5 Product Family
- Intel Xeon Phi Coprocessor System Software Developers Guide
- Intel Xeon Phi Coprocessor
- Intel Xeon Processor E5 v2 and E7 v2 Product Families Uncore Performance Monitoring Reference Manual
- Intel Xeon Processor E7 Family Uncore Performance Monitoring Programming Guide
- Intel Xeon Scalable Processor
- Interrupts in Linux
- Interval hash tree: An efficient index structure for searching object queries in large image databases
- Introduction to AMD GPU programming with HIP
- Introduction to Coccinelle
- Introduction to Debugging the FreeBSD Kernel
- Introduction to DPDK
- Introduction to Dynamic Unary Encoding
- Introduction to GPUs
- Introduction to Intel Ethernet Flow Director and Memcached Performance
- Introduction to Machine Learning CMU-10701: Deep Learning
- Introduction to Mathematics for Game Development
- Introduction to Parallel Architectures
- Introduction to Probability and Statistics Using R
- Introduction to Python for Computational Science and Engineering (A beginner's guide)
- Introduction to Random Graphs
- Introduction to the Pin Instrumentation Tool
- Introduction to x64 Assembly
- Introspection for C and its Applications to Library Robustness
- Investigation of Hardware Transactional Memory
- I/O Is Faster Than the CPU – Let's Partition Resources and Eliminate (Most) OS Abstractions
- IRON File Systems
- iSAX 2.0: Indexing and Mining One Billion Time Series
- ispc: A SPMD Compiler for High-Performance CPU Programming
- Item-Based Collaborative Filtering Recommendation Algorithms
- It's Time for Low Latency
- On an Improvement of Wien's Equation for the Spectrum
- ECMAScript Language Specification
- On the Law of Distribution of Energy in the Normal Spectrum
- Join-Idle-Queue: A Novel Load Balancing Algorithm for Dynamically Scalable Web Services
- Joint Strike Fighter Air Vehicle C++ Coding Standards For The System Development And Demonstration Program
- Jump Over ASLR: Attacking Branch Predictors to Bypass ASLR
- Jump the Queue to Lower Latency
- K: A Rewriting-Based Framework for Computations
- Kam1n0: MapReduce-based Assembly Clone Search for Reverse Engineering
- k-Ary Search on Modern Processors
- KASLR is Dead: Long Live KASLR
- Keccak and the SHA-3 Standardization
- Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels
- Kernel Debugging with WinDbg
- A Generalized Theory of Plasticity Involving the Virial Theorem
- Kernel Pool Exploitation on Windows 7
- Kernel-Mode Driver Architecture Design Guide
- KLAP: Kernel Launch Aggregation and Promotion for Optimizing Dynamic Parallelism
- KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs
- Rocket Propulsion Elements
- KNOW YOUR HTTP STATUS CODES!
- Latency and Bandwidth Impact on GPU-systems
- LaTeX for Computer Scientists
- Lazy and Speculative Execution
- Lazy Asynchronous I/O For Event-Driven Servers
- Learning a Hidden Hypergraph
- Learning statistics with R: A tutorial for psychology students and other beginners
- Learning with Hypergraphs: Clustering, Classification, and Embedding
- Lecture 11: Programming on GPUs (Part 1)
- Lecture Notes on AVL Trees
- Lecture Notes on Linear Algebra
- Less Hashing, Same Performance: Building a Better Bloom Filter
- Let your GPU do the heavy lifting in your data Warehouse
- Leveraging Compression in In-Memory Databases
- libtorque: Portable Multithreaded Continuations for Scalable Event-Driven Programs
- Lightweight Contention Management for Efficient Compare-and-Swap Operations
- Linear Algebra Abridged
- Linear Algebra
- Linear Road: A Stream Data Management Benchmark
- Linked List Problems
- B-trees, Shadowing, and Clones
- Linux Block IO: Introducing Multi-queue SSD Access on Multi-core Systems
- Linux Kernel architecture for device drivers
- Linux Productivity Tools
- Proceedings of the Linux Symposium
- 10 Gbit/s Bi-Directional Routing on standard hardware running Linux
- Disruptor: High performance alternative to bounded queues for exchanging data between concurrent threads
- Locality Exists in Graph Processing: Workload Characterization on an Ivy Bridge Server
- Lock-free Concurrent Data Structures
- Lockless Programming in Games
- Locks, Deadlocks, and Synchronization
- Logistic Regression
- Lonestar: A Suite of Parallel Irregular Programs
- Long gaps between primes
- Longest Common Extension with Recompression
- Long-term Recurrent Convolutional Networks for Visual Recognition and Description
- Loop Independence, Compiler Vectorization and Threading of loops (SSE and AVX)
- Lossless compression in lossy compression systems
- Lossless Source Coding
- Lower Bound Techniques for Data Structures
- LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data
- An insane idea on reference counting
- Lynx: Using OS and Hardware Support for Fast Fine-Grained Inter-Core Communication
- M4: A Visualization Oriented Time Series Data Aggregation
- Mach: A New Kernel Foundation For UNIX Development
- Machine Learning: The High-Interest Credit Card of Technical Debt
- Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources
- Maintaining Knowledge about Temporal Intervals
- Making Lockless Synchronization Fast: Performance Implications of Memory Reclamation
- Making Networking Apps Scream on Windows with DPDK
- Managing the Development of Large Software Systems
- Managing the Google Web 1T 5-gram with Relational Database
- Managing Traffic with ALTQ
- Lecture 7: Markov Chains and Random Walks
- MARX: Uncovering Class Hierarchies in C++ Programs
- Nikola Tesla Complete Articles And Patents
- The Ultimate Guide to Deploy Kubernetes
- Jenkins User Handbook
- The principle of least action in quantum mechanics
- AMD64 Architecture Programmer's Manual, Volume 5: 64-Bit Media and x87 Floating-Point Instructions
- Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems
- Massively-Parallel Similarity Join, Edge-Isoperimetry, and Distance Correlations on the Hypercube
- Mastering the Game of Go with Deep Neural Networks and Tree Search
- Matchings in 3-uniform hypergraphs
- Matchings in k‐partite k‐uniform hypergraphs
- Matchings, Hamilton cycles and cycle packings in uniform hypergraphs
- Math for Machine Learning
- Mathematics for Computer Science
- Maximally Consistent Sampling and the Jaccard Index of Probability Distributions
- Maximizing File Transfer Performance. Using 10Gb Ethernet and Virtualization
- Maximizing GPU Throughput Across Multiple Streams – Tips and Tricks
- Maximizing Performance of PC Games on 64-bit Platforms
- Measuring the Impact of Event Dispatching and Concurrency Models on Web Server Performance Over High-speed Networks
- MegaPipe: A New Programming Interface for Scalable Network I/O
- On the Electrodynamics of Moving Bodies
- Meltdown
- Memory Barriers: a Hardware View for Software Hackers
- Memory Efficient Hard Real-Time Garbage Collection
- Memory Ordering in Modern Microprocessors
- Memory-Efficient Search Trees for Database Management Systems
- Mental models, Consistency and Programming Aptitude
- MGtoolkit: A python package for implementing metagraphs
- Microsoft Portable Executable and Common Object File Format Specification
- Microsoft Windows Software Development Kit: Programmer's learning Guide
- Microsoft Windows RPC Security Vulnerabilities
- Mihai Patrascu: Obituary and Open Problems
- Minimal Perfect Hash Functions Made Simple
- MIPSpro C and C++ Pragmas
- MIPSpro Assembly Language Programmer's Guide
- Mismorphism: a Semiotic Model of Computer Security Circumvention (Extended Version)
- Mison: A Fast JSON Parser for Data Analytics
- Processing Relational Queries Using a Multidimensional Access Method
- Mixed Model Universal Software Thread-Level Speculation
- Mobile Computing Research Is a Hornet's Nest of Deception and Chicanery
- Modeling high-frequency limit order book dynamics with support vector machines
- Modern C
- Modern Microprocessors - A 90 Minute Guide!
- MonetDB/X100: Hyper-Pipelining Query Execution
- Monotone Minimal Perfect Hashing: Searching a Sorted Table with O(1) Accesses
- More Than You Ever Wanted to Know about Synchronization
- Compound Synchronization Objects
- DLLs the dynamic way
- Emulating Operating System Synchronization
- Multithreading for Rookies
- Multithreading performance
- Writing scalable applications for Windows NT
- Writing Windows NT server applications in MFC using I/O completion ports
- Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited
- Multiple Byte Processing with Full-Word Instructions
- Name mangling demystified
- Near-Optimal Space Perfect Hashing Algorithms
- Network stack challenges at increasing speeds
- Network Stack Specialization for Performance
- Networks of Collaborations: Hypergraph Modeling and Visualisation
- Neural Turing Machines
- Neural Word Embedding as Implicit Matrix Factorization
- New Approach for Graph Algorithms on GPU using CUDA
- New cardinality estimation algorithms for HyperLogLog sketches
- Next Generation Collaborative Reversing with Ida Pro and CollabREate
- Nobody ever got fired for using Hadoop on a cluster
- Nonblocking Algorithms and Scalable Multicore Programming
- Notes on Differential Equations
- Turing Machines
- NTFS Reference Sheet
- NTFS Documentation
- NTFS.sys crash
- Numba Python compiler for NumPy/SciPy
- Binning Data with Python
- NVDIMM Block Window Driver Writer's Guide
- NVDIMM Namespace Specification
- Artificial Intelligence and Robotics
- MS-UEdin Submission to the WMT2018 APE Shared Task: Dual-Source Transformer for Automatic Post-Editing
- NVIDIA's Next Generation CUDA Compute Architecture
- NVL: Implementing persistent memory applications
- NYSE Openbook Ultra Client Specification
- Graphics Processing Units (GPUs): Architecture and Programming - CUDA Advanced Techniques 1
- Graphics Processing Units (GPUs): Architecture and Programming - CUDA Advanced Techniques 2
- Graphics Processing Units (GPUs): Architecture and Programming - CUDA Advanced Techniques 3
- Graphics Processing Units (GPUs): Architecture and Programming - CUDA Advanced Techniques 4
- Forces in Molecules
- Instructions for objconv
- Object-Relative Addressing: Compressed Pointers in 64-Bit Java Virtual Machines
- On End-to-End Program Generation from User Intention by Deep Neural Networks
- On Hamilton cycle decompositions of r-uniform r-partite hypergraphs
- The Quantum Theory of the Electron
- On the data access issue (or why CPUs starving)
- On the de Bruijn–Newman constant
- On the Implementation of Minimum Redundancy Prefix Codes
- On the k-Independence Required by Linear Probing and Minwise Independence
- On the Performance of Bitmap Indices for High Cardinality Attributes
- On the Quest for an Acyclic Graph
- One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
- "One Size Fits All": An Idea Whose Time Has Come and Gone
- On-the-Fly Garbage Collection: An Exercise in Cooperation
- Open Crypto Audit Project TrueCrypt: Security Assessment
- Open Source Kernel Enhancements for Low-Latency Sockets using Busy Poll
- OpenGIS Implementation Standard for Geographic information - Simple feature access - Part 2: SQL option
- OpenVMS RTL String Manipulation (STR$) Manual
- Opportunistic Data Structures with Applications
- Optimization of Generalized Unary Coding
- Optimizations in C++ Compilers
- Optimizing And Interfacing With Cython
- Simulating Physics with Computers
- Optimizing Indirect Memory References with milk
- Optimizing Parallel Prefix Operations for the Fermi Architecture
- Optimizing Parallel Reduction in CUDA
- Optimizing pattern matching
- Optimizing TLS for High–Bandwidth Applications in FreeBSD
- Beginning Performance Tuning
- Plug into the cloud
- Multitenant Databases
- Oracle 12c Top 20 New Features for Developers
- Partitioning: Tips and Tricks
- Tips and Techniques for Statistics Gathering
- Understanding Oracle Locking Internals
- x86 Assembly Language Reference Manual
- Oracle Database Data Warehousing Guide, 11g Release 2 (11.2)
- Advanced Compression with Oracle Database 11g
- Oracle Database Administrator's Guide, 11g Release 1 (11.1)
- Oracle Database Concepts 11g Release 1 (11.1)
- Oracle Database Data Cartridge Developer's Guide 11g Release 1 (11.1)
- Oracle Database Reference 11g Release 1 (11.1)
- Oracle Database SQL Language Reference 11g Release 1 (11.1)
- Oracle Database Advanced Application Developer's Guide 11g Release 2 (11.2)
- Oracle Text Application Developer's Guide 11g Release 2 (11.2)
- Oracle Database 2 Day + Data Warehousing Guide 11g Release 2 (11.2)
- Oracle Database Object-Relational Developer's Guide 11g Release 2 (11.2)
- Oracle Database Performance Tuning Guide 11g Release 2 (11.2)
- Oracle Database PL/SQL Language Reference 11g Release 2 (11.2)
- Pandoc User's Guide
- Oracle Text Reference 11g Release 2 (11.2)
- Oracle Database VLDB and Partitioning Guide 11g Release 2 (11.2)
- Oracle Database Utilities 12c Release 1 (12.1.0.2)
- Index Internals
- Oracle PLSQL Coding Guidelines
- PL/SQL in Oracle 12c
- Pattern matching in sequences of rows (11)
- Oracle Spatial User Conference
- Oral History of David Cutler
- Order-Preserving Key Compression for In-Memory Search Trees
- Origins of the Simplex Method
- Out of the Tar Pit
- Outlier detection
- Overlapping Matrix Pattern Visualization: a Hypergraph Approach
- Overplotting: Unified solutions under Abstract Rendering
- Quantum Mechanics and Path Integrals
- Ownership and Reference Counting based Garbage Collection in the Actor World
- P: Safe Asynchronous Event-Driven Programming
- P tutorial
- PageRank as a Function of the Damping Factor
- pandas: powerful Python data analysis toolkit
- NumPy / SciPy / Pandas Cheat Sheet
- B-trees, Shadowing, and Clones
- Quantum Field Theory: A Modern Introduction
- Black Holes, Hawking Radiation, and the Firewall
- Memory Interleaving
- The Need for Asynchronous, Zero-Copy Network I/O
- The Zephyr Abstract Syntax Description Language
- Transactive Memory: A Contemporary Analysis of the Group Mind
- Lecture 9 CSE 260 – Parallel Computation (Fall 2015)
- Parallel Depth-First Search for Directed Acyclic Graphs
- Parallel Lossless Data Compression on the GPU
- Parallel Programming with Transactional Memory
- Parallel Random Numbers: As Easy as 1, 2, 3
- Parallel Scans and Prefix Sums
- Parallelism in Randomized Incremental Algorithms
- Parsing a SWIFT Message
- Parsing Gigabytes of JSON per Second
- Partial redundancy elimination for global value numbering
- Pattern Matching using suffix trays, arrays and trees
- Patterns of Software: Tales from the Software Community
- PC Assembly Language
- PCI Express Basics
- PE File Structure
- Portable Executable Format Layout
- A Walk Through the PE32 Format
- PE Injection Explained: Advanced memory code injection technique
- PeachPy: A Python Framework for Developing High-Performance Assembly Kernels
- Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
- Perfect Hash Families in Polynomial Time
- Perfect Hash Functions
- Perfect Hashing for Data Management Applications
- Perfect Matchings in 4-uniform hypergraphs
- Perfect matchings in large uniform hypergraphs with large minimum collective degree
- Perfect matchings in r-partite r-graphs
- Perfect Spatial Hashing
- Performance Analysis of BSTs in System Software
- Performance and Reliability Analysis Using Directed Acyclic Graphs
- Performing Advanced Bit Manipulations Efficiently in General-Purpose Processors
- Persistence Programming Models for Non-Volatile Memory
- Persistent Memory in Windows
- The Java Language Environment
- Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation
- Pin Tutorial
- PL/Python – Python inside the PostgreSQL RDBMS
- PLWAH+: A Bitmap Index Compressing Scheme based on PLWAH
- Pointer Analysis
- Polynomial-Time Perfect Matchings in Dense Hypergraphs
- Pool tag quick scanning for windows memory analysis
- Proprietary versus Open Instruction Sets
- Porting Linux to a new processor architecture
- Porting of Win32 API WaitFor to Solaris
- Python Object Sharing (POSH)
- Standard for Information Technology — Portable Operating System Interface (POSIX)
- PowerShell Basic Cheat Sheet
- Practical Data Compression for Modern Memory Hierarchies
- Practical File System Design with the Be File System
- Practical Implementations of Arithmetic Coding
- P9489 Practicals and Exercises - Spring 2013
- Preemptable Ticket Spinlocks: Improving Consolidated Performance in the Cloud
- Prefix B-Trees
- Prefix Hash Tree: An Indexing Data Structure over Distributed Hash Tables
- Prefix Sums and Their Applications
- Purposes, Concepts, Misfits, and a Redesign of Git
- Lecture 19: Virtual Memory
- Principles of Computer System Design: An Introduction
- Principles of Distributed Computing
- Printing Floating-Point Numbers: An Always Correct Method
- Printing Floating-Point Numbers Quickly and Accurately with Integers
- Proactor: An Object Behavioral Pattern for Demultiplexing and Dispatching Handlers for Asynchronous Events
- Probabilistic Graph and Hypergraph Matching
- Probability and Statistics Cookbook
- Processing Relational Queries Using a Multidimensional Access Method
- Microsoft Build 2017
- Software Security Program: Analysis with PREfast and SAL
- Program design in the UNIX environment
- Program Synthesis By Sketching
- Programming Interfaces to Non‐Volatile Memory
- Programming Satan's Computer
- Programming with Hardware Lock Elision
- Programming With the x87 Floating-Point Unit
- It's damned cold outside, so let's light ourselves a fire!
- Proofs and Refutations
- Providing Safe, User Space Access to Fast, Solid State Disks
- Proving the Correctness of Nonblocking Data Structures
- Developer Toolchain for PS4
- Pseudo-Random Number Generators for Vector Processors and Multicore Processors
- Putting Coroutines to Work with the Windows Runtime
- PyEmu: A Multi-purpose Scriptable IA 32 Emulator
- PyParallel: How we removed the GIL and exploited all cores
- Python For Data Science Cheat Sheet
- Pythran: Enabling Static Optimization of Scientific Python Programs
- Quasi-Succinct Indices
- Quick introduction into SAT/SMT solvers and symbolic execution
- Comparative analysis between QuickThread and Intel Threading Building Blocks (TBB)
- Comparison between QuickThread and OpenMP 3.0 under various system load conditions
- QuickThread
- Superscalar programming 101 (Matrix Multiply)
- RadixVM: Scalable address spaces for multithreaded applications
- Rainbow matchings in r-partite r-graphs
- Reactor: An Object Behavioral Pattern for Demultiplexing and Dispatching Handles for Synchronous Events
- Real Programming in Functional Languages
- Real-World Concurrency
- Realizing quality improvement through test driven development: results and experiences of four industrial teams
- Real-Time Parallel Hashing on the GPU
- Reasoning about temporal relations: a maximal tractable subclass of Allen's interval algebra
- Recognizing Unordered Depth-First Search Trees of an Undirected Graph in Parallel
- Recollections of Early Chip Development at Intel
- Reconsidering Custom Memory Allocation
- Reducing Cache Pollution Through Detection and Elimination of Non-Temporal Memory Accesses
- Reducing the Space Requirement of Suffix Trees
- Reevaluation of Programmed I/O with Write-Combining Buffers to Improve I/O Performance on Cluster Systems
- Refactoring the FreeBSD Kernel with Checked C
- Reflective DLL Injection
- Register Level Sort Algorithm on Multi-Core SIMD Processors
- Regular and almost universal hashing: an efficient implementation
- Introduction to Linux: A Hands on Guide
- Relative Suffix Trees
- Remote Library Injection
- Repeating History Beyond ARIES
- Replacing suffix trees with enhanced suffix arrays
- Resumable Functions (revision 3)
- Rethinking SIMD Vectorization for In-Memory Databases
- Retroactive Data Structures
- Retrofitting Word Vectors to Semantic Lexicons
- Reverse Engineering for Beginners
- Reverse-Engineering Instruction Encodings
- Rewriting History