Anyone use Anaconda for your python install/package management? Just curious if anyone has had success upgrading anaconda from 2.3 to 2.4 and python from 3.4 to 3.5 without running into all sorts of issues. It seems that they don't make it very easy or efficient to upgrade. Almost easier to do a full clean install of Anaconda 2.4/python 3.5
How long did it take you guys to get used to using the GDB debugger? All of my work from now on has to be done in a command-line Unix/Linux programming environment, and debugging with GDB is slower than what I was doing with Visual Studio's debugger by a factor of 5 at least.
It's a content management system, like Wordpress, but I believe it's a little more robust and customizable than WP.
Anyone have a suggestion for the best way to install R on a windows machine? Just saw that Anaconda has an r-essentials thing, not sure if I should try that or just do the standard R install and then download R-studio
It continues to get better with each release but out of the box Drupal kind of sucks for the novice user. It tends to require more initial configuration in order to deliver a decent user experience than its chief competitor WordPress. But that said it is supremely flexible and configurable, and can cover a lot more ground than WP out of said box. Drupal can be a delight for non-technical content owners - provided that the person(s) setting it up are really comfortable in its ecosystem and know exactly how to configure it on behalf of their client. If you are working an SEO angle, you may want to look into modules that allow users to manage metadata and microdata.
i'm in the process of applying/interviewing for a startup job. they use ASP.NET and a little bit of C#, two languages which I have never used. Anybody have experience with ASP.NET that can give me advice?
learnvisualstudio.net C# course is what i'm going through right now, and it's actually a really good place to start imo. plus it's free.
well- what's the best way to learn? And what non-Microsoft language does ASP.NET's functionality most closely resemble? I currently use mostly Python, with some JS.
It stems from C but I've heard it closely resembles Java in the overall structure and the strongly-typed nature of the language. There are a ton of resources out there (MSDN is huge for once you actually get familiar with the language: https://msdn.microsoft.com/en-us/library/67ef8sbd.aspx). I would start with learnvisualstudio.net like another poster mentioned, and https://channel9.msdn.com/ is also a pretty good resource with a number of videos. If you are really committed to learning, there are some sites out there which require a subscription (Pluralsight is my favorite), but they have a TON of knowledge on everything (not just including C#).
Probably depends on what you want to do with it. R+RStudio is great, and is what most use. Can't go wrong with it really. But if you are planning on consistently promoting code up through managed environments and need to make sure the right packages are always in the right place, are constantly sharing code amongst others, etc. then the Conda distribution is probably easier for that. Or if you are already very familiar with Anaconda, Jupyter, etc., like Jupyter notebooks, and want to use Python right along with R, then the Conda distribution might be the way to go.
I don't understand that at all. ASP.NET, as I understand it, is a version of ASP that was made for the .NET platform. It's all designed for the web. How could something like that be based off C?
yeah, team anaconda. I spent like 3 hours trying to get the oracle_python library installed properly (downloaded a bunch of oracle files - instant client/odbc driver) and it still wouldn't work. then via stackoverflow I tried conda install -c https://conda.anaconda.org/anaconda cx_oracle and it downloaded everything for me and worked
Meant the language (C#), sorry for the confusion. .NET has a few different languages that can be used, but the most commonly used is C#, which stems from C.
Go for it. I'm about halfway through. I already have a degree in analytics, so it doesn't really introduce any new concepts for me, but it helps keep things fresh, and I want it for the resume. Plus, I like messing around with different data and tools, since what I do at work is always the same types of (boring) data. At the very least, it'll teach you/force you to learn R, if that's your goal.
this one is different than what you are thinking. this uses hadoop, spark, hive, etc. - the cloudera ecosystem
Oh OK, I thought that you posted the data science specialization for some reason. That one looks cool, too. I've been working hard to get my work to move our system over to hadoop, and I'm getting ready to run a POC of Cloudera's Hadoop distribution here at work in a couple of months. Hortonworks makes their ecosystem easier to play around with, though. They have a fully configured sandbox that runs on VirtualBox or VMWare Fusion, and a bunch of tutorials to go along with it. It's not a true distributed system, since it's really just one node, so it isn't very performant, but you can still get the hang of Hive, Pig, Spark, etc. with it.
the time has come: https://github.com/yhat/rodeo It's a python version of RStudio. Has worked really well for me so far. Displaying visuals like Bokeh and Highcharts doesn't seem to work, but matplotlib and seaborn work so far
That looks awesome. I rarely use IDEs, mostly just vim and debug as I go, but I'm excited to try this.
i gotcha. So a typical .NET application would use C# for most back-end logic and ASP.NET for most front-end logic, i assume? the way things were explained to me, is that there's a dashboard that's hard-coded into the back end logic, and one thing i'm going to be working on is bringing some sense into that, i.e. convert it to a back end API with a front end that calls the API.
ASP.NET stuff using C# is kind of like writing Object Oriented C#, but it's not strict OO programming. If you're using the MVC paradigm (which is all I've done, so I can't speak to anything else), it's similar to writing something using MVC with a javascript library (i.e. you have controllers that are responsible for views and for making calls to your routes, and your backend should solely be responsible for storing and retrieving data). For C# ASP.NET you can use Razor or ASPX for your HTML; both give you some syntactic sugar where you can write logic on the page itself to determine what and how your HTML renders. I guess defining your controllers and view models might be where you'll need to learn up if you're not used to strongly typing your things. Your backend will be where you'll need some familiarity writing C#, where you need to create classes, interfaces, proper inheritance, and then some experience with maybe Entity Framework if you're trying to store data with SQL Server. EF is super awesome once you get familiar with it, and you can do awesome things where you create your tables first, define your table relationships, and then create classes based off them. There's also the more traditional route where you define your models first and then create your tables based off that. I'm getting a little rusty with my ASP.NET stuff, so some of this might be oversimplified and not entirely correct, but I used it solely at my previous job.
ugh, goddamit i hate microsoft vocabulary. i understood some of that. it's frustrating that i've done all of this, just in a different non-microsoft framework, so i don't know what the fuck "model views" or "controllers" or "entity frameworks" are. thankfully wikipedia is helpful:
yeah MVC isn't ASP.NET specific, it's a way of adding a layer between your frontend and backend. Entity Framework is an ORM that allows you to create a model (think of logging in as a User to TMB). If you're using SQL, our User information may exist across multiple tables and an ORM helps grab a hold of the model without having to write a bunch of SQL with lots of JOIN statements.
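Since you mostly use Python, here's a rough sketch of the JOIN-and-map work an ORM saves you from writing by hand. This is not Entity Framework, just plain Python with the built-in sqlite3 module, and the `users`/`profiles` tables, the `User` class, and the sample row are all made up for illustration.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """One logical user, even though the data lives in two tables."""
    id: int
    username: str
    email: str
    city: str


def make_user_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("CREATE TABLE profiles (user_id INTEGER, email TEXT, city TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    conn.execute("INSERT INTO profiles VALUES (1, 'alice@example.com', 'Boston')")
    return conn


def load_user(conn: sqlite3.Connection, user_id: int) -> Optional[User]:
    """Hand-written version of the JOIN-and-map step an ORM would generate."""
    row = conn.execute(
        """
        SELECT u.id, u.username, p.email, p.city
        FROM users AS u
        JOIN profiles AS p ON p.user_id = u.id
        WHERE u.id = ?
        """,
        (user_id,),
    ).fetchone()
    # Map the flat result row onto the model object, or None if no match.
    return User(*row) if row else None


if __name__ == "__main__":
    print(load_user(make_user_db(), 1))
```

With an ORM you'd declare the model and its relationships once and ask for the user by id; the query and the row-to-object mapping above are the parts it writes for you.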
Assuming I'm a relatively smart guy and have my share of time on my hands, how long would it take me to become proficient at using SQL?
sql is pretty simple to pick up - I think it took me 6 months to be pretty good at it, but I wasn't using sql every day at that point. the hard part (for me at least) was understanding where to look in the database - which table a certain field is in all depends on how "clean" your data is and how much data is in your data warehouse
Very quickly. sqlnet, sqlzoo, and w3 schools have good starting tutorials. After that you really just need a data set that has real-world meaning to you to play around with.
Quickly. Shit is easy to get the basics down and then a little more, but it might take some time penning some massive stored procs and being an uber stackoverflow nerd
SQL isn't hard to understand and is very useful to know for so many different types of jobs, whether you're a Programmer/DBA or not. However writing powerful stored procedures that you try to optimize (seriously, use use use indexes and composite indexes if your machine can spare the disk space) which use aggregate functions, pivoting, and derived tables is truly a specialization. An amazing DBA at my last job optimized the shit out of our application since we were working on the order of millions of row inserts that had to be grouped and counted.
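To make the composite index point concrete, here's a small sketch using Python's built-in sqlite3 module. It's SQLite rather than whatever DBMS that application ran on, and the `events` table, its columns, and the index name `idx_events_user_day` are invented for the example. It asks SQLite for its query plan with and without a two-column index.

```python
import sqlite3


def make_events_db(with_index: bool) -> sqlite3.Connection:
    """Build an in-memory table of made-up rows, optionally with a composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id INTEGER, day TEXT, payload TEXT)")
    rows = [(u, f"2015-11-{d:02d}", "x") for u in range(50) for d in range(1, 21)]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    if with_index:
        # Composite index: one index covering both filter columns.
        conn.execute("CREATE INDEX idx_events_user_day ON events (user_id, day)")
    return conn


def query_plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's plan text for a count filtered on both columns."""
    plan = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT COUNT(*) FROM events WHERE user_id = ? AND day = ?",
        (7, "2015-11-03"),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in plan)


if __name__ == "__main__":
    print("no index:  ", query_plan(make_events_db(with_index=False)))
    print("with index:", query_plan(make_events_db(with_index=True)))
```

Without the index the plan is a full table scan; with it, the plan names the index. The trade-off is the one mentioned above: the index costs disk space and slows inserts a bit in exchange for faster lookups.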
I wouldn't be doing too much more than you can do in Excel in terms of manipulating the data, but I have 9.2 million lines of data to deal with.
What do you want to do with the data? Is it already in a database? All DBMS have many powerful built-in functions these days (some more than others), so you can do a ton of different things with your data, but depending on what you are trying to do, there may be better options.
The data is addresses that are parsed into the various components of the street address – number, street name, city, ZIP, etc. I want to be able to ignore some of the columns and identify addresses that are duplicates based on the unignored columns.
There are some text-mining tricks that you may want to look into to make sure you're comparing like to like (looking for extra spaces, etc) but it should be fairly easy. Just group by the columns you want to compare, count them up and see where it's > 1
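Here's a minimal sketch of that group-and-count idea, run through Python's built-in sqlite3 module so it's easy to try. The `addresses` table, its columns, and the sample rows are all hypothetical; the `unit` column plays the part of a column you want to ignore, and TRIM/UPPER stand in for whatever cleanup your real data needs.

```python
import sqlite3

# Hypothetical sample rows: (id, number, street, city, zip, unit).
ROWS = [
    (1, "12", "Main St", "Springfield", "11111", "Apt 1"),
    (2, "12", " main st ", "Springfield", "11111", "Apt 2"),  # same address, messy text
    (3, "40", "Oak Ave", "Springfield", "11111", ""),
    (4, "40", "Oak Ave", "Shelbyville", "22222", ""),
]


def make_address_db() -> sqlite3.Connection:
    """Load the sample rows into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE addresses "
        "(id INTEGER PRIMARY KEY, number TEXT, street TEXT, city TEXT, zip TEXT, unit TEXT)"
    )
    conn.executemany("INSERT INTO addresses VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def find_duplicate_groups(conn: sqlite3.Connection) -> list:
    """Return (number, street, city, zip, count) for each group seen more than once.

    `unit` is left out of the GROUP BY, so it is ignored when comparing.
    """
    query = """
        SELECT TRIM(number)        AS number_key,
               UPPER(TRIM(street)) AS street_key,
               UPPER(TRIM(city))   AS city_key,
               TRIM(zip)           AS zip_key,
               COUNT(*)            AS n
        FROM addresses
        GROUP BY TRIM(number), UPPER(TRIM(street)), UPPER(TRIM(city)), TRIM(zip)
        HAVING COUNT(*) > 1
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for group in find_duplicate_groups(make_address_db()):
        print(group)
```

Rows 1 and 2 land in the same group once the spacing and case are normalized; rows 3 and 4 don't, because the city and ZIP differ.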
This should work to identify dups. Partition by the columns you are comparing (the unignored ones), not the ignored ones.
Code:
Select a.*
,Row_Number() Over(Partition By a.compared_column_1, a.compared_column_2, a.compared_column_3 Order By a.one_of_the_ignored_columns) as Whammy
From whammys_work_table a
Qualify Whammy = 2
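QUALIFY isn't available in every database engine, so here's a hedged sketch of the same row-numbering idea with the filter moved into an outer query. It uses Python's built-in sqlite3 module and assumes the bundled SQLite is 3.25 or newer (needed for window functions); the `addresses` table and its rows are made up. It also filters on `rn > 1` rather than `= 2`, so every extra copy in a group is returned, not just the second one.

```python
import sqlite3


def make_numbered_db() -> sqlite3.Connection:
    """Build an in-memory table of made-up addresses with one triple duplicate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE addresses "
        "(id INTEGER PRIMARY KEY, number TEXT, street TEXT, zip TEXT, unit TEXT)"
    )
    conn.executemany(
        "INSERT INTO addresses VALUES (?, ?, ?, ?, ?)",
        [
            (1, "12", "Main St", "11111", "Apt 1"),
            (2, "12", "Main St", "11111", "Apt 2"),
            (3, "12", "Main St", "11111", "Apt 3"),
            (4, "40", "Oak Ave", "11111", ""),
        ],
    )
    return conn


def extra_copies(conn: sqlite3.Connection) -> list:
    """Return the ids of every row after the first in each duplicate group.

    Rows are grouped on the compared columns (number, street, zip);
    `unit` is ignored. Within a group, the lowest id counts as the original.
    """
    query = """
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (
                       PARTITION BY number, street, zip
                       ORDER BY id
                   ) AS rn
            FROM addresses
        )
        WHERE rn > 1
        ORDER BY id
    """
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    print(extra_copies(make_numbered_db()))
```

The ids that come back are the ones you'd review or delete if you want to keep one row per address.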
Finally got around to trying this stuff out, 6 months after I was tagged in the OP. Been enjoying the Coursera Python courses from UM, currently on the 3rd one. Trying to find something more challenging.
there is so much to learn, and so many new things keep coming out in the big data space, that it can get overwhelming