Events Sponsored by Dropbox, Ness, Boeing, & Stanford Computer Forum.
Winter
24 hours · Open Source Theme · Local VCs on judge panel · Win plane tickets to Las Vegas · Unlimited Caffeine · Midnight Korean BBQ · Be there
An overview of the job search process from the point of view of a graduating senior, focusing on the key decision points needed to get a job and start a career you will enjoy.
By Paul Twohey, Co-Founder of Ness Computing.
ACM and Low Battery will once again be hosting a video game party featuring the popular game "League of Legends." Riot Games (creators of LoL) will be there with free food, swag, and representatives to answer your questions about the games industry. Bring your laptop/desktop for some epic gaming action, or play on the consoles we'll have available.
Fall
Join ACM and Low Battery for our first video game party of the quarter! This LAN's theme is LAN of Legends, because it will be featuring a League of Legends tournament as well as swag from Riot. Bring your laptop/desktop for some epic gaming action, or play on the consoles we'll have available.
ACM and Low Battery present our second video game party of the quarter: FPS Fest Part Deux! We will have tournaments in Counter-Strike: GO and Call of Duty, with the winners having the opportunity to represent Stanford in a Black Ops tournament against Cal. As always, bring your laptop/desktop, or play on the consoles we'll have available. All games welcome!
Tech Talks
Sponsored by Yahoo. Every Friday @ 6 PM in Gates 104.
If you want to give a tech talk, read this.
Fall
Engineering Challenges at Quora
Given a discrete probability distribution, one can sample from it quickly using standard algorithms such as the alias method (Walker 1977). However, these algorithms assume the distribution does not change between samples, that is, the sampling is done with replacement. We present an efficient algorithm for sampling from a discrete probability distribution without replacement and its application at Hulu.
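As a rough illustration of the problem this abstract describes, here is a minimal Python sketch of weighted sampling without replacement. It uses the well-known random-key technique (draw a uniform number u per item, rank items by u^(1/w), keep the top k), which is not necessarily the algorithm presented in the talk. The function name and the example weights are hypothetical.

```python
import heapq
import random


def sample_without_replacement(weights, k, rng=None):
    """Pick k distinct items, favoring items with larger weights.

    weights: dict mapping item -> non-negative weight (hypothetical input shape).
    Items with zero weight are never selected.
    """
    rng = rng or random.Random()
    candidates = [(item, w) for item, w in weights.items() if w > 0]
    if k > len(candidates):
        raise ValueError("k exceeds the number of items with positive weight")
    # Each item gets the key u ** (1 / w); larger weights push the key toward 1.
    keyed = ((rng.random() ** (1.0 / w), item) for item, w in candidates)
    # Keep the k items with the largest keys; compare on the key only.
    top = heapq.nlargest(k, keyed, key=lambda pair: pair[0])
    return [item for _, item in top]


if __name__ == "__main__":
    shows = {"drama": 5.0, "comedy": 3.0, "news": 1.0, "unlisted": 0.0}
    print(sample_without_replacement(shows, 2, random.Random(7)))
```

Because every item receives one key and the keys are ranked once, no item can be returned twice, and the distribution does not need to be rebuilt after each draw.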
Development efficiency is vital to save time and iterate quickly. In an effort to improve front-end efficiency, Nodefront, a Node.js-powered utility, bundles a static site compiler, live development server, library fetcher, and build utility into one simple command-line tool. Best of all, it's free and open source. Join Karthik for an introduction to Nodefront and how it can help your workflow.
Slides for the presentation are available online.
An in-depth look at MongoDB and how we're leading the NoSQL revolution: 10gen is the company behind MongoDB, the open source document-oriented database helping tens of thousands of companies quickly and easily deliver, scale, and operate applications.
Reactivity and the realtime web: Reactivity is the key to writing next generation realtime web applications. This tech talk will break down how we think about reactivity at Asana, and provide a glimpse into some of the cutting edge reactivity research we are pursuing.
(5pm) In this tech talk, Katherine Scott and Anthony Oliver, lead developers of SimpleCV, will discuss the current state of the art of computer vision in Python, using Python as a free and open-source alternative to Matlab, and their experience creating a start-up that focuses on web-enabled machine vision for manufacturing.
(6pm) "Hacking the Graph." Addepar's mission is to fix global finance by creating transparency and reducing risk. We'll present our idea of a global financial graph. We'll show the type of computations we can do on that graph to reveal risk in a novel way.
At Palantir Health, we want to empower the bioscience and healthcare experts who are solving the hardest problems in their domains to work with data as if they were computer scientists. Lekan Wang, Product Lead for Palantir Health (BS ’09, MS ’11), and Abi Raja (’13), will talk about and show a few experimental summer projects they worked on toward this goal.
MySQL to NoSQL, the great debate like Cal vs. Stanford
Optimizing Hash Aggregation to Best Utilize the CPU - In my summer working on Cloudera Impala, I was tasked with finding the optimal hash aggregation algorithm. The naive implementation of this operation has an essentially random memory access pattern and thus slows down dramatically when the size of the hash table exceeds the size of the processor cache. This talk will discuss the challenges and strategies for optimizing algorithms to best take advantage of modern CPUs and how this applies to the problem of hash aggregation.
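To make the operation concrete, here is a minimal Python sketch of hash aggregation (a group-by sum). It sits alongside one commonly used cache-friendly variant that first partitions rows by key hash, so each partition's hash table stays small. This only illustrates the structure of the problem, not Impala's implementation, and Python will not show the cache effects themselves. The function names and the partition count are assumptions.

```python
from collections import defaultdict


def hash_aggregate(rows):
    """Naive group-by sum: one hash table covering every distinct key."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value  # lookups land at effectively random table slots
    return dict(totals)


def partitioned_hash_aggregate(rows, num_partitions=4):
    """Split rows by key hash first, then aggregate each partition separately.

    Each per-partition table holds only a fraction of the keys, which is the
    idea behind keeping the working set within the processor cache.
    """
    partitions = [[] for _ in range(num_partitions)]
    for key, value in rows:
        partitions[hash(key) % num_partitions].append((key, value))
    result = {}
    for part in partitions:
        # A key can only appear in one partition, so update() never overwrites.
        result.update(hash_aggregate(part))
    return result


if __name__ == "__main__":
    rows = [("us", 3), ("uk", 1), ("us", 4), ("fr", 2), ("uk", 5)]
    print(hash_aggregate(rows))
    print(partitioned_hash_aggregate(rows))
```

Both functions return the same totals; the partitioned version trades an extra pass over the data for smaller hash tables.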
Prismatic delivers realtime newsfeeds for thousands of learned topics based on your interests and social networks. The Prismatic backend requires many custom abstractions, sophisticated machine learning techniques, data crawling, and graph analysis algorithms. We discuss how our small backend team (just four engineers) built this stack in record time using a unique design philosophy facilitated by Clojure. In a nutshell, our problem-solving approach prefers lightweight composable custom abstractions for problems rather than monolithic open-source frameworks (e.g., Hadoop).
(1 hour) Game development using a tool called Construct 2, which allows you to build games and export to a number of platforms including Windows 8. There will be a tutorial to build an introductory shooter game, and you will learn how to use concepts like objects, events, and behaviors.
(1 hour) Secure off-line web apps with Firefox OS: Firefox OS is Mozilla's new project to develop an open and hackable mobile phone operating system. The talk will cover the basics of the architecture and porting web applications to use secure APIs. Bring your laptop and we will go over downloading and using B2G Desktop, the Firefox OS simulator that runs on Windows, Mac, and Linux. We will cover installation, basic usage, web application development and installation, remote debugging, and advanced Web APIs.
Winter
At TrialPay, an alternative payments platform with over 200 million users and 100 million impressions/day, we face many security-related challenges. Our CTO and co-founder, Eddie Lim, will be covering best practices and pitfalls to avoid when storing user credentials and credit card numbers.
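As one illustration of the kind of practice a talk on credential storage typically covers, here is a minimal Python sketch that stores a salted, deliberately slow hash instead of the password itself. It uses PBKDF2 from the standard library. The iteration count and function names are assumptions for illustration, and nothing here is taken from TrialPay's own system.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; tune it to the hardware in use.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest) for a password, generating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)  # a fresh salt per user defeats precomputed tables
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, stored))
    print(verify_password("wrong guess", salt, stored))
```

Only the salt and digest would be persisted. The constant-time comparison avoids leaking how many leading bytes of a guess matched.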
ArduSat is a prototype nanosatellite platform that aims to give independent citizens the ability to perform affordable, on-demand space science. It is the first of many satellites that will carry the Arduino microprocessor, which NanoSatisfi (the team developing the satellite) aims to use to stimulate Science, Technology, Engineering, and Math (STEM) education and increase student interest in science and technology. The platform will be the first satellite to give unrestricted access to space hardware to private citizens worldwide.
Matt Dalio, CEO of Endless and Stanford alumnus, along with Stanford CS student Patrick Ward, will demo their product and discuss methods for developing a free Linux desktop environment and apps to serve billions in developing countries via smartphones. They’ll discuss portability, distribution, ease of use, network infrastructure constraints, and security.
(one hour) Salesforce Touch offers a new way for sales people to engage with customers, colleagues, and the world. Now reps and managers have the ability to close deals wherever they are, on any mobile device. With Touch, CRM is just the beginning. You can easily extend your success and take advantage of the power of HTML5 to build custom mobile apps on the Touch platform.
Trends in database technologies: Over the last few years we've seen a range of NoSQL stores from MongoDB to Cassandra to Neo4j break the traditional dominance of relational databases. This talk will look at the evolving use cases for the four main types of NoSQL data stores and will then look at the next big thing in databases - immutable data stores like Datomic.
Druid is a distributed column-oriented analytical data store in use at Metamarkets (http://www.metamarkets.com). One Druid cluster currently exposes a data set of >30 billion rows of data representing >1.5 trillion impressions. The talk will discuss design considerations and architecture of the system.
Costa Sapuntzakis (PhD '06) of Pure Storage will talk about how to build an enterprise storage system around flash. The system is 10x faster than disk-based systems, consumes 10x less power, takes 10x less space, is a lot simpler to set up and use, and for many workloads costs less. You'll learn about the architecture and data structures that make this thing tick as well as more general insights on using flash well. About the speaker: Constantine Sapuntzakis (PhD '06) works on Purity at Pure Storage. In 2005, Costa co-founded moka5, Inc., to commercialize his doctoral research into the delivery of software through virtual appliances. While at Cisco Systems in 2000, Costa developed the initial draft of the iSCSI protocol along with IBM.
Building high performance web applications: Learn how to build complex web applications that are fast and responsive. We'll cover a bunch of tips and tricks, which we used in developing our "Photos" page on the web site, including: - Taking advantage of automatic browser caching - Lazy loading of images - Hacking around DOM limitations - Making AJAX requests progressive to enable incremental loading - Using HTML5 local storage
(one hour) Scaling the Medical Genome: Counsyl is now doing genomic testing on more than 2% of all births in the United States. We're a Bay Area genomics startup that's developed an affordable pre-pregnancy test that has been featured in the New York Times, named one of Scientific American's Top 10 World Changing Ideas, and awarded the Wall Street Journal Innovation Award for Medicine. We will cover interesting material in robotics, sequencing, machine learning, or building the UI for the medical genome.
Sift Science fights fraud with machine learning. This talk will present the kinds of fraud endemic to the Internet and how we use large-scale machine learning to predict it and fight back. We will focus on the system and algorithmic challenges of "real world" machine learning from hundreds of millions of complex, heterogeneous events and how Sift solves them.
Architecting a Scalable & Simple Web Frontend: Backbone.js and Bootstrap are great tools to use as a foundation for your product... but where can you go from there? This talk highlights front-end challenges and triumphs at Backplane, a Palo Alto based startup building communities and solving the problem of organized mass communication. We will go through some basics but will focus heavily on advanced topics such as optimizing frontend deployment, making up for existing libraries' weaknesses, JS OOP design patterns, and engineering your frontend for a growing organization.
Spring
Technical challenges of scaling a service that sees over 100 million photos shared each day, including keeping request latency low while maintaining reliability and security under traffic patterns unique to our application.
A candid talk about building and designing Hack Design (http://hackdesign.org/) in just 2 weeks and how to iterate quickly by designing in code. Topics touched upon include how to get started, best practices, automation, etc.
(One Hour) Realtime Ads on Twitter: A salient feature of Twitter is its real-time nature. As an event unfolds, Twitter users produce, as well as consume, tweets pertaining to this event. In most cases, trending events have never been seen before and are short-lived. One of the interesting challenges we'll discuss is monetizing ephemeral realtime events.
A technical discussion of the various strategies Splunk employs for indexing, searching, and reporting over huge unstructured datasets. We will cover the actual structure of our indexes on disk, as well as how Splunk leverages them to provide high performance searching and reporting without an upfront schema. We'll discuss some of the challenges in reporting on structured versus unstructured data, and give a sneak peek at some upcoming technology that addresses these challenges. In lieu of a traditional dull PowerPoint presentation, this will be purely whiteboard and demo based.
Sneakpeeq is re-imagining shopping experiences from the ground up. This has been fueled by innovations in the web application space, which has exploded as a large set of emerging frameworks has arrived on the scene. We'll dive deeper into one of these, AngularJS, and explore how we apply it at Sneakpeeq to write responsive and performant applications with a clean separation of concerns.
Software Defined Networking in the 10/40/100G Data Center: An overview of how Arista is pioneering new SDN switching technologies in the public/private cloud, and multi-tenancy Next Gen Data Centers. - Intro to Arista EOS: the power of running Linux in your network - The Command API: a JSON-RPC interface to the CLI - OpenStack integration - Fast Server Failover: the network and Hadoop working better together
At Ness Computing, our mission is to make recommendations personal. Ness’ first product is a mobile restaurant recommendation product. In order to provide such a service, and provide a more relevant experience for users, there are a number of concrete sub-problems that must be solved, which will be discussed. The first is Canonicalization, taking many different data sources to create a single set of golden data that is robust to noise, and cleaner than any of the individual datasets. The second is the use of Collaborative Filtering to provide personalized scores for how much a user will like one location versus another. The third is more intelligent recommendations, using the personalized score, along with other signals like time of day to suggest restaurants a user is likely to enjoy.
The Information Structure of Web Layout: There's a lot of information on the web, but most of it is designed for people, not for automatic use by computers. Diffbot is a service that extracts and labels information from web pages by "machine reading" them, to make the information usable for aggregation, indexing, mashups, and automatic analysis. We'll talk about the machine learning tasks that Diffbot uses to extract information, and the stages of that processing, from HTML rendering to machine vision to NLP, and give an overview of the challenges in building a system able to interpret real-world web data, which require robust engineering as well as a combination of diverse approaches from NLP, the RDF/semweb community, and design semantics. Along the way, we'll demo some behind-the-scenes components of the technology and show how real production systems work. Bring your laptop!
Agile Data and Machine Learning: How Sharethrough uses the power of Scala, Hadoop, and agile development to build data-driven ad products. Young engineers and data scientists often get overwhelmed and caught up with the latest algorithm. We'll discuss how the tools and concepts of software engineering are just as important as picking the right algorithm when building data products. In a startup environment the ability to rapidly iterate on underlying algorithms while shipping product is an exciting challenge, especially given the size of modern data sets.
Low Latency Programming and HPC Work Flow Management for Algorithmic Trading: Algorithmic trading strategies utilize machine learning and statistical models created using terabytes of data. These strategies stream live data from the exchanges and in a fraction of a second make a decision to buy or sell. Our talk will discuss some of the problems that we face in creating and running these strategies and the techniques and solutions that can help address these issues in any low latency, high throughput algorithmic system.
Scopely is creating the next generation of social mobile games, building a platform for app development with a focus on technology, distribution, and monetization. Through a new paradigm for the development, distribution, and monetization of mobile apps, we enable third-party game developers to build, retain and monetize a loyal engaged audience within a highly competitive app ecosystem. We have also created a social platform to engage users with brand advertisers. Scopely will discuss mobile gaming's unique opportunities, the technologies we use, and tools leveraged to increase monetization, retention, and drive awesome gameplay.
Real-Time Tracking of NASCAR cars for Live TV and Media: Learn how data, such as position, orientation, and velocity, is collected from a full field of at-speed NASCAR race cars and transformed into real-time content for TV, the internet, and mobile devices. A combination of technologies including military-grade GPS, compact IMUs, and radio communication will be discussed.