Data Best Practices Workshop - 2019/03/27
At the end of March 2019, the first Data Best Practices Workshop was held in the SAPP Center with the generous support of the Wu Tsai Neuroscience Institute. Thirty-eight individuals registered for the three-hour workshop to learn how to work with data efficiently, with a focus on free, secure, rapid, and redundant storage. All workshop materials are available online, and the workshop itself was recorded (both resources are restricted to the Stanford community).
Note: Since the workshop, Google has changed the behavior of Apps Script so that Cloud Projects can no longer be created via the steps shown in the video, and OAuth credentials for Google Drive therefore cannot be generated that way. OAuth credentials can still be set up for free at http://console.developers.google.com, but doing so requires an active Google Cloud Project. Such a project can be obtained in two ways:
Request a Google Cloud Platform (GCP) account from Stanford IT
Use a non-Stanford Google account (e.g., @gmail.com address) to create a Google Cloud Platform Project
The alternative is to leave the client_id and client_secret fields blank in rclone, in which case rclone falls back to its built-in default credentials. This is fine for testing purposes but not advised for production, as transfer rates will be significantly reduced.
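As a rough illustration of where these two fields live, the sketch below builds the text of an rclone configuration section for a Google Drive remote. The remote name `gdrive`, the helper `build_drive_remote`, and the placeholder credential strings are all hypothetical; `type`, `scope`, `client_id`, and `client_secret` are options of rclone's Google Drive backend. The OAuth token itself is not produced here and would still be obtained through rclone's own authorization step (for example by running `rclone config`).

```python
import configparser
import io


def build_drive_remote(name, client_id="", client_secret=""):
    """Return rclone.conf-style text for one Google Drive remote.

    Leaving client_id and client_secret empty corresponds to the
    "blank fields" option described above (fine for testing only).
    """
    config = configparser.ConfigParser()
    config[name] = {
        "type": "drive",            # rclone's Google Drive backend
        "scope": "drive",           # full access to the user's Drive files
        "client_id": client_id,     # from your own Google Cloud Project
        "client_secret": client_secret,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values; substitute the credentials generated in the
    # Google developer console. "gdrive" is an arbitrary remote name.
    print(build_drive_remote("gdrive", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET"))
```

The printed section could be reviewed and then merged by hand into the rclone configuration file; calling `build_drive_remote("gdrive")` with no credentials yields the testing-only variant with both fields empty.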
A survey was taken at the end of the workshop, with 36 responses from attendees. Survey results appear below:
Participants were also asked which other offerings they would like to see (a checkbox question allowing multiple selections).
At least 70% of participants wanted to see:
Data workflows/pipelines (78%)
Parallel computation strategies in cluster environments (78%)
High-performance interactive (single user) data processing (72%)
At least 30% of participants wanted to see:
Data organization principles (39%)
Optimization of data access/loading (minimizing latency and maximizing throughput) (39%)
Securing and sharing data best practices (33%)
Data file format discussions (31%)