Welcome to MIM Central

MIM Central is an information hub for the MIM community, providing news, resources, and opportunities for students, faculty, alumni, and potential students interested in Information Management at University of Maryland’s iSchool.

If you have suggestions, post a reply to this message or send the MIM Director, Brian Butler, a note at bsbutler@umd.edu.

Posted in Site Administration

10 Hottest Healthcare IT Developer and Programming Skills

A convergence of technology, legislation and the mandated migration to ICD-10 medical classifications makes healthcare one of the hottest areas within IT. Here’s a look at the skills most in demand, who’s hiring and where the jobs are.

According to the U.S. Bureau of Labor Statistics, the healthcare industry is leading the market in job creation. This shouldn’t be surprising when you consider all that’s going on within healthcare and the technology needed to support it. So what type of programming and developer skills are healthcare employers looking for?

Where the Jobs Are

Let’s begin with where the most healthcare IT jobs are within the U.S. According to Indeed’s data, Boston; Washington, D.C.; and New York City top the list of cities with the most HIT developer jobs.

There are a number of factors to consider if you’re going to a new city for a job. Salary, the cost of living, traffic and commuting conditions all play an integral part in the decision-making process when you’re weighing a change of location.

SQL

SQL (Structured Query Language) was originally based on relational algebra and tuple relational calculus, and consists of a data definition language and a data manipulation language used for managing data in a relational database management system (RDBMS). One of the most useful healthcare IT skills, SQL enables developers to insert, query, update and delete data, as well as create and modify schemas and control access to data.
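
To make the DDL/DML distinction concrete, here is a minimal sketch using SQLite through Python’s standard library; the patients table, its columns and the ICD-10 codes are hypothetical examples, not a real healthcare schema:

    # DDL + DML against SQLite via Python's standard library; the table,
    # columns and ICD-10 codes are hypothetical examples.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Data definition: create the schema.
    cur.execute("""
        CREATE TABLE patients (
            patient_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            diagnosis  TEXT
        )
    """)

    # Data manipulation: insert, update, query, delete.
    cur.execute("INSERT INTO patients (name, diagnosis) VALUES (?, ?)",
                ("Jane Doe", "I10"))                      # I10: essential hypertension
    cur.execute("UPDATE patients SET diagnosis = ? WHERE name = ?",
                ("E11.9", "Jane Doe"))                    # E11.9: type 2 diabetes
    for row in cur.execute("SELECT patient_id, name, diagnosis FROM patients"):
        print(row)
    cur.execute("DELETE FROM patients WHERE patient_id = ?", (1,))

    conn.commit()
    conn.close()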

Java

Java was created and released in 1995 at Sun Microsystems by James Gosling as a core component of Sun’s Java platform. The syntax of Java comes mostly from C and C++, although it offers fewer low-level facilities than those two languages. Java was designed to have few implementation dependencies, and is considered a general-purpose, concurrent, class-based, object-oriented programming language that allows developers to “write once, run anywhere” (WORA), providing platform independence with no recompiling necessary.

In the healthcare industry, Java is popular for small to large embedded devices, and is often used to develop remote patient monitoring applications and software for a diverse range of robust sensors.

HTML

The HyperText Markup Language, or HTML, is the premier building block of the Web, used to create Web pages and, with the advent of the latest version, HTML5, Web applications. A standard Web browser, whether it’s Internet Explorer, Chrome, Firefox, Opera or the mobile Dolphin browser, reads HTML-based documents and converts them into visible or audible Web pages by interpreting the HTML tags and displaying the contents of the page. By creating apps with HTML5, healthcare workers are able to access the same data regardless of the Internet-connected device they are using.

JavaScript

JavaScript is a multi-paradigm language that supports object-oriented, imperative and functional programming styles. It’s an interpreted programming language whose key design principles were taken from the Self and Scheme programming languages. JavaScript was originally used within Web browsers so that client-side scripts could provide user interaction, browser control and asynchronous communication, as well as the capability to alter the displayed document content.

JavaScript has evolved into a prototype-based scripting language that, along with HTML5 and CSS3, is used for game development and full-fledged healthcare application development.

XML

The use of standards is pivotal in giving healthcare providers the capability to interoperate and share patient records more effectively. XML, the Extensible Markup Language, is an open-standard markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is often used for the representation of arbitrary data structures and emphasizes simplicity, generality and usability. “XML is becoming more widely used in interfacing between systems, providing a standard architecture,” says Montgomery.
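
As an illustration of what “human-readable and machine-readable” means in practice, here is a hedged sketch: a made-up patient-record fragment (the element names are invented, not an actual healthcare standard such as HL7 CDA) parsed with Python’s standard library:

    # A made-up XML patient-record fragment parsed with the standard library;
    # real exchange formats (e.g., HL7 CDA) are far richer.
    import xml.etree.ElementTree as ET

    record = """
    <patient id="12345">
        <name>Jane Doe</name>
        <encounter date="2014-04-01">
            <diagnosis code="I10">Essential hypertension</diagnosis>
        </encounter>
    </patient>
    """

    root = ET.fromstring(record)
    print(root.get("id"))                   # 12345
    print(root.findtext("name"))            # Jane Doe
    for dx in root.iter("diagnosis"):
        print(dx.get("code"), dx.text)      # I10 Essential hypertension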

C#

The C# programming language is a multi-paradigm language that supports imperative, generic, declarative, procedural, functional, class-based, object-oriented and component-oriented programming disciplines, giving developers the functionality needed to create sophisticated applications for the healthcare industry, including Electronic Medical Records (EMR) systems, Laboratory Information Management Systems (LIMS, LIS), EMR alerting systems and more.

Created by Microsoft as part of its .NET initiative, C# was meant to be a simple, modern, general-purpose, object-oriented programming language, but has proven itself to be much more.

C++

C++ is an intermediate-level programming language that includes the functionality of both high-level and low-level languages. It was created by Bjarne Stroustrup in 1979 at Bell Labs, and was originally called C with Classes, as it added object-oriented features–most notably classes–to the C programming language.

C++ is still one of the most popular programming languages and compiles to efficient native code. It’s used for system software, device drivers and high-performance client-server software; in the healthcare industry, among other uses, it provides the internal functionality of medical imaging analysis devices. “C++, C and C# are all used in back-end programming of HIS systems,” says Montgomery.

ASP.NET

Unlike the other technologies covered here, ASP.NET is a server-side Web application framework that was designed by Microsoft in 2002 to enable developers to create dynamic websites, Web applications and Web services. It was created as the successor to Microsoft’s Active Server Pages (ASP) technology, and was built on the Common Language Runtime (CLR), which allowed programmers to code ASP.NET using any supported .NET language.

ASP.NET is used within the healthcare industry for the creation and implementation of Web-based Software-as-a-Service (SaaS) application suites, electronic payment processing systems, healthcare data management systems and more.

PHP

PHP–which initially stood for Personal Home Page and is now a recursive acronym that stands for PHP: Hypertext Preprocessor–is a server-side scripting language that is most commonly used for Web development, but is also used as a general-purpose programming language. It was created by Rasmus Lerdorf in 1995, and is now installed on more than 200 million websites.

A Web server with the PHP processor module installed interprets PHP code embedded in an HTML document, though PHP can also be used in standalone graphical applications or through a command-line interface. It is still often used in the healthcare industry; Mindfire Solutions, for example, used it to create a Web-based Secure Electronic Health Record management application.

C

C is the oldest programming language covered in this slideshow. It was developed by Dennis Ritchie between 1969 and 1973 at AT&T Bell Labs. It has facilities for structured programming, allows lexical variable scope and recursion, and was designed with a static type system to prevent unintended operations. It’s a general-purpose programming language whose constructs map efficiently to typical machine instructions, which is why it is often found in legacy applications that were previously developed in assembly language, especially system software such as the Unix operating system. Many healthcare institutions still rely on computers running Unix, and, for that reason, C programming continues to be a vital skill within the industry.

To read the full article:

http://www.cio.com/slideshow/detail/103069/10-Hottest-Healthcare-IT-Developer-and-Programming-Skills#slide13

 

 

Posted in Development, Health IT, IM Topics

PricewaterhouseCoopers (PwC) Talk on April 22nd!

PwC will be coming to talk to MIM students on April 22nd (tomorrow!) at 4pm in 2116. Alum Ben Ferko will be here, along with employees Rasheed Shaik and Liyao Wan, to discuss Big Data Visualization Techniques and Technologies and to collect resumes.

Posted in Events, Speakers

4 Qualities to Look for in a Data Scientist

Every business, it seems, needs a data scientist, but not everyone knows what to look for. The four qualities of a good data scientist described here will help you first write a job description and then evaluate candidates for your data scientist vacancy.

It’s hard to resist the sparkly nirvana that big data, leveraged appropriately, promises to those who choose to embrace it. You can transform your business, become more relevant to your customers, increase your profits and target efficiencies in your market all by simply taking a look at the data you probably already have in your possession but have been ignoring due to a lack of qualified talent to glean value from it.

Enter the data scientist — arguably one of the hottest jobs on the market. The perfect candidate is a numbers whiz and savant at office politics who plays statistical computing languages like a skilled pianist. But it can be hard to translate that ideal into an actionable job description and screening criteria.

This article explains several virtues to look for when identifying suitable candidates for an open data scientist position on your team. It also notes some market dynamics when it comes to establishing compensation packages for data scientists.

Because “data scientist” represents a bit of a new concept, without a lot of proven job descriptions, you’ll want to work closely with your human resources department on the rubric and qualifications you use to screen initial resumes and also set up a first round of interviews. What follows are five salient points that should prove useful as you qualify candidates for a data scientist role.

1. A Good Data Scientist Understands Statistics and Laws of Large Numbers

Trends are seen in numbers. For example, a good data scientist understands, “This many customers behave in this certain way” or “This many customers intersect with others at this many precise points.” Over large quantities of data, trends pop out in numbers.

A great data scientist has the skillset to understand trends in large numbers and the ability to translate that understanding into predictive analytics: interrogating large quantities of data, extracting trends, then using predictive modeling techniques to anticipate behavior across the aggregate dataset. Statistics are also helpful in preparing reports for management and prescribing recommended courses of action.
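
For a toy illustration of “extract the trend, then use it predictively,” the sketch below fits a simple least-squares line to synthetic weekly purchase counts and projects the next week; real work would involve far larger datasets and richer models:

    # Synthetic weekly purchase counts; fit a least-squares trend and project week 7.
    weeks = [1, 2, 3, 4, 5, 6]
    purchases = [120, 135, 150, 158, 171, 186]

    n = len(weeks)
    mean_x = sum(weeks) / n
    mean_y = sum(purchases) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, purchases))
             / sum((x - mean_x) ** 2 for x in weeks))
    intercept = mean_y - slope * mean_x

    next_week = 7
    forecast = intercept + slope * next_week
    print(f"trend: +{slope:.1f} purchases/week; week {next_week} forecast: {forecast:.0f}")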

While a mathematics degree would be ideal, many qualified candidates have taken a slightly more practical academic path. Don’t be scared away by interviewees who lack advanced mathematics credentials. A focus on statistics in a candidate’s academic career, whether at the bachelor level or above, would prove sufficient for this type of position.

2. A Good Data Scientist Is Inquisitive

Part of the allure and mystique of big data is the art of teasing actionable conclusions from a giant haystack of (typically) unstructured data. It’s generally not enough to know how to write queries that find specific information; a good data scientist also generates the context: which queries should be run, what data we would like to know, and what data we don’t yet realize we would like to know but that could be of interest.

Yes, great data scientists execute queries and database runs, but they also design suggestions for architecting queries in ways that not only return a defined set of results to answer a question someone already asked, but that also reveal new insights into questions that have not yet been asked by an organization. This is where the real value of a data scientist will present itself over the coming years.

While some might argue that this is a soft skill that’s difficult to interview for, carefully crafted hypothetical scenarios presented to candidates during interviews can help you understand their thought process, their approach to a problem, the various ways the candidate would attempt to glean the answers to the problem and what other questions the candidate could pose that would add value to the original query. Stress to candidates during the interviews that outside-the-box thinking is encouraged, while limiting answers to only the problems posed is discouraged.

3. A Good Data Scientist Is Familiar With Database Design and Implementation

It’s important for today’s data scientists to sit somewhere between an inquisitive university research scientist (which is essentially what the previous point describes) and a software developer or engineer: someone who knows how to tune a lab and operate machinery well.

Even though much of what falls under the “big data” category is known as unstructured data, a fundamental understanding of both relational and columnar databases can really serve a data scientist well. Many corporate data warehouses are of the traditional row-based relational database sort. While big data is new and alluring, much actionable data and trends can be teased from traditional databases.

Data scientists will also play a key role in setting up analytics and production databases to take advantage of new techniques. A history of working with databases would provide great context for setting up new systems in the new role.

Additionally, many big data vendors build SQL-like languages into their products in an attempt to woo traditional database administrators who have no desire to learn a MapReduce-style language. Knowledge of traditional SQL will continue to pay dividends, allowing data scientists to play nicely and integrate well with the other database professionals you already have on staff.
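
A rough sketch of why that SQL knowledge transfers: the same aggregation can be written as the familiar GROUP BY that Hive-style tools expose and as the map/shuffle/reduce steps it boils down to. This is a pure-Python, in-memory illustration with invented records, not any particular product’s execution engine:

    # The same aggregation two ways: the SQL-ish form a Hive-style tool exposes,
    # and the map/shuffle/reduce steps underneath.
    #
    #   SELECT page, COUNT(*) FROM visits GROUP BY page;
    from itertools import groupby
    from operator import itemgetter

    visits = [{"page": "/home"}, {"page": "/labs"}, {"page": "/home"}]

    mapped = [(v["page"], 1) for v in visits]          # map: emit (key, 1) per record
    mapped.sort(key=itemgetter(0))                     # shuffle/sort: group by key
    counts = {page: sum(c for _, c in group)           # reduce: sum counts per key
              for page, group in groupby(mapped, key=itemgetter(0))}
    print(counts)                                      # {'/home': 2, '/labs': 1}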

4. A Good Data Scientist Has Baseline Proficiency in a Scripting Language

Your most qualified candidates should be awarded extra points for knowing Python at least somewhat well. Many query jobs over vast quantities of unstructured data are issued in scripts and take quite some time to run.

Python is generally accepted as the most compatible, most versatile scripting language for working with columnar databases, MapReduce-style queries and other elements of the data scientist puzzle. Python is an open source language known to be fairly usable and easy to read, so it shouldn’t pose much of a hurdle for your base of data scientist candidates to overcome.
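
The sketch below shows the kind of small, script-driven query job described above: scanning a file of semi-structured JSON-lines records and tallying one field. The file name and field names are hypothetical:

    # Scan a (hypothetical) events.jsonl file -- one JSON object per line --
    # and tally purchase events per user.
    import json
    from collections import Counter

    per_user = Counter()
    with open("events.jsonl") as fh:
        for line in fh:
            event = json.loads(line)
            if event.get("type") == "purchase":
                per_user[event["user_id"]] += 1

    for user_id, n in per_user.most_common(10):
        print(user_id, n)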

You could also consider “pseudo code” skills, or the ability to write out, almost in plain English, how an algorithm or a query would work. Such a test shows the quality of the applicant’s thinking, the approach taken to a problem and how he or she would begin to solve it, regardless of whether the applicant actually possesses the skills in any given language to pull it off.
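
For instance, a “pseudo code” answer to a hypothetical prompt such as “which customers’ spending dropped this quarter?” might look like plain-English steps followed by only a skeletal translation; the function name and data shapes below are invented for illustration:

    # One possible shape of a "pseudo code" interview answer (all names invented):
    #   1. Pull each customer's total spend for last quarter and this quarter.
    #   2. Compute the change for each customer.
    #   3. Keep customers whose spend fell by more than some threshold.
    #   4. Sort by the size of the drop and report the top of the list.

    def customers_with_spending_drop(spend_by_quarter, threshold=0.2):
        """spend_by_quarter maps customer_id -> (last_quarter, this_quarter) totals."""
        drops = []
        for customer, (last_q, this_q) in spend_by_quarter.items():
            if last_q > 0 and (last_q - this_q) / last_q > threshold:
                drops.append((customer, last_q - this_q))
        return sorted(drops, key=lambda item: item[1], reverse=True)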

Be Prepared to Show Data Scientists the Money

As demand for data scientists increases, and as long as it outstrips the supply of qualified candidates, salaries are rising. In almost any metro market in the United States, data scientists are receiving six-figure base salaries — obviously higher in high-cost markets such as the West Coast. In Silicon Valley, in particular, multiple offers for a qualified candidate are not uncommon.

Don’t attempt to pay below market rates for this position. Even startups are paying data scientists comfortable wages and giving them the chance to work on challenging new products, a departure from their traditional modus operandi of loading up on equity and paying measly wages. Put simply: Don’t cheap out and expect great talent.

To read the full article:

http://www.cio.com/article/751478/4_Qualities_to_Look_for_in_a_Data_Scientist?page=1&taxonomyId=600010

Posted in Data Analytics, Data Management, Data Mining, IM Topics

10 Hot Hadoop Startups to Watch

As data volumes grow, figuring out how to unlock value becomes vastly important. Hadoop enables the processing of large data sets in a distributed environment and has become almost synonymous with big data. Here are 10 startups with solutions for unlocking big data value.

It’s no secret that data volumes are growing exponentially. What’s a bit more mysterious is figuring out how to unlock the value of all of that data. A big part of the problem is that traditional databases weren’t designed for big data-scale volumes, nor were they designed to incorporate different types of data (structured and unstructured) from different apps.

Lately, Apache Hadoop, an open-source framework that enables the processing of large data sets in a distributed environment, has become almost synonymous with big data. With Hadoop, end users can run applications on systems composed of thousands of nodes that pull in thousands of terabytes of data.
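
One concrete way developers plug into that model is Hadoop Streaming, which lets any executable that reads standard input and writes tab-separated key/value lines act as a mapper or reducer. Below is a minimal word-count pair as a sketch; the file names are arbitrary and the exact streaming-jar invocation varies by distribution:

    # mapper.py -- emit one "word<TAB>1" line per token read from standard input
    import sys

    for line in sys.stdin:
        for word in line.split():
            print(word + "\t1")

    # reducer.py -- Hadoop sorts mapper output by key, so counts can be summed per run
    import sys

    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(current + "\t" + str(total))
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(current + "\t" + str(total))

    # Submitted (roughly) as: hadoop jar <path-to-hadoop-streaming-jar> \
    #   -input /data/in -output /data/out -mapper mapper.py -reducer reducer.py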

According to Gartner estimates, the current Hadoop ecosystem market is worth roughly $77 million. The research firm expects that figure to balloon to $813 million by 2016.

Here are 10 startups hoping to grab a piece of that nearly $1 billion pie. These startups were chosen and ranked based on a combination of funding, named customers, competitive positioning, the track record of their executives, and their ability to articulate a real-world problem and explain why their solution is an ideal one to solve it.

1. Platfora

What They Do: Provide a big data analytics solution that transforms raw data in Hadoop into interactive, in-memory business intelligence.

Headquarters: San Mateo, Calif.

CEO: Ben Werther, who formerly served as vice president of products at DataStax.

Founded: 2011

Funding: $65 million to date. The latest round ($38 million Series C) was locked down in March. Tenaya Capital led the round, while Citi Ventures, Cisco, Allegis Capital, Andreessen Horowitz, Battery Ventures, Sutter Hill Ventures, and In-Q-Tel all participated.

Why They’re on This List: As with many startups on this list, Platfora was founded in order to simplify Hadoop. While businesses have been rapidly adopting Apache Hadoop as a scalable and inexpensive solution to store massive amounts of data, they struggle to extract meaningful value from that data. The Platfora solution masks the complexity of Hadoop, which makes it easier for business analysts to leverage their organization’s myriad data.

Platfora tries to simplify the data collection and analysis process, automatically transforming raw data in Hadoop into interactive, in-memory business intelligence, with no ETL or data warehousing required. Platfora provides an exploratory BI and analytics platform designed for business analysts. Platfora gives business analysts visual, self-service analytical tools that help them navigate from events, actions, and behaviors to business facts.

Customers include Comcast, Disney, Edmunds.com and the Washington Post.

Competitive Landscape: Platfora competes with the likes of Datameer, Tableau, IBM, SAP, SAS, Alpine Data, and Rapid-I.

Key Differentiator: Platfora claims to have the first scale-out in-memory Big Data Analytics platform for Hadoop. Platfora’s focus on simplifying Hadoop and Big Data analysis is becoming a more common goal of late, but they are an early mover in this respect.

2. Alpine Data Labs

What They Do: Provide a Hadoop-based data analysis platform.

Headquarters: San Francisco, Calif.

CEO: Joe Otto, formerly senior vice president of sales and service at Greenplum.

Founded: 2010

Funding: $23.5 million in total funding, including $16 million in Series B funding, from Sierra Ventures, Mission Ventures, UMC Capital and Robert Bosch Venture Capital.

Why They’re on This List: Most executives and managers don’t have the time or skills to code in order to glean data insights, nor do they have the time to learn about complex new infrastructures like Hadoop. Rather, they want to see the big picture. The trouble is that complex advanced analytics and machine learning typically require scripting and coding expertise, which can leave those capabilities in the hands of data scientists alone. Alpine Data mitigates this issue by making predictive analytics accessible via SaaS.

Alpine Data provides a visual drag-and-drop approach that allows data analysts (or any designated user) throughout an organization to work with large data sets, develop and refine models, and collaborate at scale without having to code. Data is analyzed in the live environment, without migrating or sampling, via a Web app that can be locally hosted.

Alpine Data leverages the parallel processing power of Hadoop and MPP databases and implements data mining algorithms in MapReduce and SQL. Users interact with their data directly where it already sits. Then, they can design analytics workflows without worrying about data movement. All this is done in a Web browser, and Alpine Data then translates these visual workflows into a sequence of in-database or MapReduce tasks.

Customers include Sony, Havas Media, Scala, Visa, Xactly, NBC, Avast, BlackBerry, and Morgan Stanley.

Competitive Landscape: Alpine will compete both with large incumbents (SAS, IBM, SPSS, and SAP) and such startups as Nuevora, Platfora, Skytree, Revolution Analytics, and Rapid-I.

Key Differentiator: Alpine Data Labs argues that most competing solutions are either desktop-based or point solutions without any collaborative capability. In contrast, Alpine Data has a “SharePoint-like” feel. On top of collaboration and search, it also provides modeling and machine learning under the same roof. Alpine is also part of the no-data-movement camp: regardless of whether a company’s data is in Hadoop or an MPP database, Alpine sends out instructions, via its In-Cluster Analytics, without ever moving data.

3. Altiscale

What They Do: Provide Hadoop-as-a-Service (HaaS).

Headquarters: Palo Alto, Calif.

CEO: Raymie Stata, who was previously CTO of Yahoo.

Founded: March 2012

Funding: Altiscale is backed by $12 million in Series A funding from General Catalyst and Sequoia Capital, along with investments from individual backers.

Why They’re on This List: Hadoop has become almost synonymous with Big Data, yet the number of Hadoop experts available in the wild cannot hope to keep up with demand. Thus, the market for HaaS should rise in step with big data. In fact, according to TechNavio, the HaaS market will top $19 billion by 2016.

Altiscale’s service is intended to abstract the complexity of Hadoop. Altiscale’s engineers set up, run, and manage Hadoop environments for their customers, allowing customers to focus on their data and applications. When customers’ needs change, services are scaled to fit — one of the core advantages of a cloud-based service.

Customers include MarketShare and Internet Archive.

Competitive Landscape: The HaaS space is heating up. Competitors come from incumbents such as Amazon Elastic MapReduce (EMR), Microsoft’s Hadoop on Azure, and Rackspace’s service based on Hortonworks’ distribution. Altiscale will also compete directly with Hortonworks and with such startups as Cloudera, Mortar Data, Qubole, and Xplenty.

Key Differentiator: Altiscale argues that they are “the only firm to actually provide a soup-to-nuts Hadoop deployment. By comparison, AWS forces companies to acquire, install, deploy, and manage a Hadoop implementation — something that takes a lot of time.”

4. Trifacta

What They Do: Provide a platform that enables users to transform raw, complex data into clean and structured formats for analysis.

Headquarters: San Francisco, Calif.

CEO: Joe Hellerstein, who in addition to serving as Trifacta’s CEO is also a professor of Computer Science at Berkeley. In 2010, Fortune included him in their list of 50 smartest people in technology, and MIT Technology Review included his Bloom language for cloud computing on their TR10 list of the 10 technologies “most likely to change our world.”

Founded: 2012

Funding: Trifacta is backed by $16.3 million in funding raised in two rounds from Accel Partners, XSeed Capital, Data Collective, Greylock Partners, and individual investors.

Why They’re on This List: According to Trifacta, there is a bottleneck in the data chain between the technology platforms for Big Data and the tools used to analyze data. Business analysts, data scientists, and IT programmers spend an inordinate amount of time transforming data. Data scientists, for example, spend as much as 60 to 80 percent of their time transforming data. At the same time, business data analysts don’t have the technical ability to work with new data sets on their own.

To solve this problem, Trifacta uses “Predictive Interaction” technology to elevate data manipulation into a visual experience, allowing users to quickly and easily identify features of interest or concern. As analysts highlight visual features, Trifacta’s predictive algorithms observe both user behavior and properties of the data to anticipate the user’s intent and make suggestions without the need for user specification. As a result, the cumbersome task of data transformation becomes a lightweight experience that is far more agile and efficient than traditional approaches. Lockheed Martin and Accretive Health are early customers.

Competitive Landscape: Trifacta will compete with Paxata, Informatica and Cirro.

Key Differentiator: Trifacta argues that the problem of data transformation requires a radically new interaction model — one that couples human business insight with machine intelligence. Trifacta’s platform combines visual interaction with intelligent inference and “Predictive Interaction” technology to close the gap between people and data.

5. Splice Machine

What They Do: Provide a Hadoop-based, SQL-compliant database designed for big data applications.

Headquarters: San Francisco, Calif.

CEO: Monte Zweben, who previously worked at the NASA Ames Research Center where he served as the Deputy Branch Chief of the Artificial Intelligence Branch. He later founded and served as CEO of Blue Martini Software.

Founded: 2012

Funding: They are backed by $19 million in funding from Interwest Partners and Mohr Davidow Ventures.

Why They’re on This List: Application and Web developers have been moving away from traditional relational databases due to rapidly growing data volumes and evolving data types. New solutions are needed to solve scaling and schema issues. Splice Machine argues that even a few short months ago Hadoop, while viewed as a great place to store massive amounts of data, wasn’t ready to power applications.

Now, with emerging database solutions, features that made RDBMS so popular for so long, such as ACID compliance, transactional integrity, and standard SQL, are available on top of the cost-effective and scalable Hadoop platform. Splice Machine believes that this enables developers to get the best of both worlds in one general-purpose database platform.

Splice Machine provides all the benefits of NoSQL databases, such as auto-sharding, scalability, fault tolerance, and high availability, while retaining SQL, which is still the industry standard. Splice Machine optimizes complex queries to power real-time OLTP and OLAP applications at scale without rewriting existing SQL-based apps and BI tool integrations. By leveraging distributed computing, Splice Machine can scale from terabytes to petabytes by simply adding more commodity servers. Splice Machine is able to provide this scalability without sacrificing the SQL functionality or the ACID compliance that are cornerstones of an RDBMS.

Competitive Landscape: Competitors include Cloudera, MemSQL, NuoDB, Datastax, and VoltDB.

Key Differentiator: Splice Machine claims to have the only transactional SQL-on-Hadoop database that powers real-time big data applications.

To read the full article:

http://www.cio.com/article/751572/10_Hot_Hadoop_Startups_to_Watch_?page=1&taxonomyId=3002

Posted in Data Analytics, Data Management, Development, IM Topics

14 Things You Need to Know About Data Storage Management

If you think backing up files and software to a storage device or to the cloud will automatically preserve and protect them (and your organization), think again. Data storage and management experts discuss what steps you need to take to properly manage and store data — and why just backing up data is not enough.

“When it comes to storing data, there is no ‘one-size-fits-all’ solution,” says Orlando Scott-Cowley, Messaging, Security and Storage Evangelist at Mimecast, a cloud and mobile data storage and security provider.

Before you decide where or how you will store your structured and unstructured data, “companies first need to understand the amount and type of data they have along with the motivation behind storing the information,” Scott-Cowley says. “Having this background will help determine what route to take, whether building on-premise solutions or moving to the cloud,” or some combination of the two.

So how do you formulate that sound data storage management strategy? CIO.com asked dozens of storage and data management experts, which resulted in these top 14 suggestions regarding what steps you need to take to choose the right data storage solution(s) for your organization — and how you can better ensure your data is properly protected and retrievable.

1. Know your data. “All data is not created equal — and understanding the business value of data is critical for defining the storage strategy,” says Souvik Choudhury, senior director, Product Management at SunGard Availability Services. So when formulating your data storage management policy, ask the following questions (a minimal policy sketch follows the list):

  • How soon do I need the data back if lost?
  • How fast do I need to access the data?
  • How long do I need to retain data?
  • How secure does it need to be?
  • What regulatory requirements need to be adhered to?
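
One hedged way to make those answers concrete is a small, machine-readable policy map keyed by data class; the classes, tiers and retention periods below are invented for illustration:

    # A hypothetical policy map capturing the answers above; every class, tier
    # and number here is invented for illustration.
    RETENTION_POLICY = {
        "patient_records": {"rto_hours": 4,  "tier": "primary", "retain_years": 7, "encrypted": True},
        "email_archive":   {"rto_hours": 24, "tier": "archive", "retain_years": 5, "encrypted": True},
        "web_logs":        {"rto_hours": 72, "tier": "cold",    "retain_years": 1, "encrypted": False},
    }

    def policy_for(data_class):
        """Look up the storage and retention rules for a class of data."""
        return RETENTION_POLICY[data_class]

    print(policy_for("email_archive")["retain_years"])   # 5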

2. Don’t neglect unstructured data. “Think about how you might want to combine multi-structured data from your transactional systems with semi-structured or unstructured data from your email servers, network file systems, etc.,” says Aaron Rosenbaum, director, Product Management, MarkLogic, a database solution provider. “Make sure that the data management platform you choose will let you combine all these types without months or years of data modeling effort.”
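
As a hypothetical illustration of that combination, the sketch below joins structured transaction records with semi-structured support-email metadata on a shared customer key; all field names and records are invented:

    # Hypothetical records only: join structured transactions with
    # semi-structured support-email metadata on a shared customer key.
    transactions = [
        {"customer": "c-101", "order_total": 250.0},
        {"customer": "c-102", "order_total": 75.5},
    ]
    support_emails = [
        {"customer": "c-101", "subject": "Late delivery", "sentiment": "negative"},
    ]

    emails_by_customer = {}
    for email in support_emails:
        emails_by_customer.setdefault(email["customer"], []).append(email)

    # Combine the two sources into one view per customer.
    combined = [
        {**txn, "complaints": len(emails_by_customer.get(txn["customer"], []))}
        for txn in transactions
    ]
    print(combined)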

3. Understand your compliance needs. “If you are a publicly traded company or operating within a highly regulated industry such as financial services or healthcare, the bar has been set high for compliance and security,” says Jay Atkinson, CEO of cloud hosting provider AIS Network.

“If you choose to outsource your data storage and management, ensure that your managed services provider has the credentials needed to provide a highly secure, compliant environment. Failure to operate in total compliance may lead to severe penalties later,” says Atkinson.

4. Establish a data retention policy. “Setting the right data retention policies is a necessity for both internal data governance and legal compliance,” says Chris Grossman, senior vice president, Enterprise Applications, Rand Worldwide and Rand Secure Archive, a data archiving and management solution provider. “Some of your data must be retained for many years, while other data may only be needed for days.”

“When setting up processes, identify the organization’s most important data and prioritize storage management resources appropriately,” says Scott-Cowley. “For example, email may be a company’s top priority, but storing and archiving email data for one particular group, say the executives, may be more critical than other groups,” he says. “Make sure these priorities are set so data management resources can be focused on the most important tasks.”

5. Look for a solution that fits your data, not the other way around. “Many think the only choice to make is whether they need DAS, a SAN or a NAS,” says Olivier Thierry, chief marketing officer at Pivot3, the provider of converged, highly available shared storage and virtual server appliances. “These are important choices, but they are insufficient,” he continues.

“While a Fibre Channel SAN may be great for doing a lot of low latency read/write operations on a fairly structured database, it’s not typically designed to do well on spikey unstructured video workloads,” Thierry says. So “instead of selecting a one-size-fits-all strategy, smarter buyers are now considering the workload characteristics and picking the right storage strategy for the job.”

Similarly, “look for a solution that provides the flexibility to choose where data is stored: on premise and/or in the cloud,” says Jesse Lipson, founder of ShareFile and VP & GM of Data Sharing at Citrix. “The solution should allow you to leverage existing investments in data platforms such as network shares and SharePoint.”

And if like many businesses these days you have a mobile workforce, the data management and storage solution you choose “should be optimized for mobile and virtual platforms, in addition to desktops and laptops — and provide a consistent experience across any platform, including mobile editing capabilities and intuitive experience across mobile devices, virtual desktops or desktops.”

6. Don’t let upfront costs dictate your decision. “The real cost of storage comes from operating the solution over several years,” says Antony Falco, cofounder and CEO of Orchestrate.io. So “make sure you really understand your operating costs [or total cost of ownership]: personnel, third-party support, monitoring, even the chance you’ll lose data, which certainly carries a cost,” he says. “These all quickly dwarf the upfront costs to purchase and deploy.”

“Many users buy storage (systems or services) because of large initial discounts or they neglect to think through the costs of their chosen storage years down the road,” adds Jon Hiles, senior product manager at storage solution provider Spectra Logic.
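
A toy comparison with invented numbers shows why the upfront price alone is a poor guide; the option that is cheaper to buy can cost more to own over three years:

    # Invented numbers: compare three-year cost of ownership, not just purchase price.
    YEARS = 3

    options = {
        "discounted_array": {"upfront": 40_000, "ops_per_year": 35_000},  # staff, support, monitoring
        "managed_service":  {"upfront": 10_000, "ops_per_year": 42_000},
    }

    for name, cost in options.items():
        tco = cost["upfront"] + cost["ops_per_year"] * YEARS
        print(f"{name}: ${tco:,}")
    # discounted_array: $145,000
    # managed_service:  $136,000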

To read the full article:

http://www.cio.com/article/739499/14_Things_You_Need_to_Know_About_Data_Storage_Management?page=1&taxonomyId=3037

Posted in Data Management, Data Mining, IM Topics

Why CIOs Should Look Outside for Data Expertise

We’re all familiar with the traditional outsourcing model of using external service providers to handle IT infrastructure and maintenance in hopes of cutting costs and freeing in-house IT staff to focus on higher-value activities unique to the business. But something similar is happening with data analytics: Companies are starting to supplement their in-house analytics capabilities by using external providers.

At a recent meeting of the Society for Information Management’s Advanced Practices Council, researchers Gabe Piccoli (University of Pavia) and Federico Pigni (Grenoble Ecole de Management) described a startup called Versium–one example of a company offering this new breed of data-analytics services. Versium can combine a business’s own data with Versium’s collection of customer data and apply predictive analytics to better understand, find and retain customers.

Versium’s data warehouse has over 300 billion online and offline observations about consumers, such as purchase interests, social-media behavior, demographics, education level, family status, financial rating and life changes that might trigger new purchases. These attributes are combined with enterprise data to produce predictive scores and consumer intelligence.

Predictive scores include fraud scores (who is trying to scam us?), churn scores (who is most likely to cancel?), social influencer scores (which customers affect peers’ behavior?), wealth scores (what is the predictive buying power of my consumers?), shopper scores (who are discount shoppers vs. full price?), and recommendation scores (which offers should be sent to which consumers?).

At the council meeting, Barbara Wixom, an expert in business intelligence at MIT’s Center for Information Systems Research, offered other examples of companies getting data and analytics from external providers–either while they build their internal capacity or in lieu of doing so. She cited the rental-car company Hertz, which supplements its in-house analytics resources and data warehouse with external services.

Hertz outsources the selection and provision of non-Hertz data, as well as the processes of modeling and cleansing data, hosting and managing data, and gleaning insights from that data. These outsourced capabilities allowed Hertz to let customers swap or upgrade their reserved rental car on their mobile phones.

Hertz also supplements its in-house capabilities with software from IBM and Mindshare Technologies for a “voice of the customer” analytics system that examines thousands of comments from Web surveys, emails and text messages so the company can quickly pinpoint and resolve customer problems.

Using a set of linguistic rules, the system automatically categorizes comments with descriptive tags like “vehicle cleanliness,” “staff courtesy” and “mechanical issues,” freeing location managers from having to tag them manually. The system also flags customers who request a callback from a manager or who mention Hertz’s customer loyalty program.
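
The sketch below shows the general shape of such rule-based tagging, deliberately simplified to keyword rules; it is not the actual Hertz/Mindshare system, and the tag vocabulary and keywords are illustrative only:

    # Deliberately simplified keyword rules mapping free-text comments to tags;
    # the vocabulary is invented and far cruder than a real linguistic rule set.
    TAG_RULES = {
        "vehicle cleanliness": ["dirty", "smell", "stain", "clean"],
        "staff courtesy":      ["rude", "friendly", "helpful", "courteous"],
        "mechanical issues":   ["engine", "brake", "warning light", "breakdown"],
    }

    def tag_comment(text):
        text = text.lower()
        tags = [tag for tag, keywords in TAG_RULES.items()
                if any(keyword in text for keyword in keywords)]
        return tags or ["uncategorized"]

    print(tag_comment("The car was clean but the brake light stayed on"))
    # ['vehicle cleanliness', 'mechanical issues']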

Analytics outsourcing can speed up the delivery of new services, provide access to advanced technology, and give access to data scientist skills that are notoriously hard to acquire and retain in-house. And it could be a short-term solution while the company figures out its data-analytics strategy of the future. But be careful: You’ll need contract terms that protect competitive information and practices.

To read the full article:

http://www.cio.com/article/749877/Why_CIOs_Should_Look_Outside_for_Data_Expertise?page=2&taxonomyId=600010

Posted in Data Analytics, Data Management, IM Topics

How to Use Service Catalogs to Combat Cloud Sprawl

Cloud usage is growing dramatically, but unfortunately some of that growth is the result of employees going around IT and obtaining services directly, resulting in cloud sprawl. Service catalogs can help you get your arms around cloud services, regain control of business processes, and enable you to better serve business users.

To get service catalogs right, however, you need to take into account a host of critical factors: scalability, manageability, security and profitability.

* Scalability: You have to keep up. It is critical to select a service catalog that’s easy for IT to manage and easy for customers to use. For example, a user may at first only need a server. However, additional needs related to that server can quickly pop up, sending the user looking to add firewalls, applications, security and backup. The service catalog needs to be flexible enough to accommodate the additional offerings and make them available in a timely manner.

The unfortunate reality is that most service catalogs require development resources to make changes, which often delays the rollout of new offerings. This presents challenges as business needs develop, forcing many users to go outside of IT to secure the services and offerings they need.

A good IT service catalog won’t require an army of JavaScript coders to manage it. By providing a framework for managing cloud offerings with drag-and-drop functionality — and eliminating the need for coding — it makes it easier to add new services. This empowers business owners — who best understand the business problems — to be involved in developing and maintaining the service catalog. Once IT demonstrates it can deliver cloud services and offerings quickly and cost effectively, business users will be more likely to enlist internal resources for their needs, reducing renegade cloud sprawl.

* Security: Heightened in the Cloud. Here’s an example of a typical scenario that can lead to a major security breach: Marketing teams often prefer to run large campaigns on cloud servers. But if the IT team can’t deliver a cloud server quickly and cost effectively, the marketing team is likely to go directly to an outside service supplier to keep the project on time. What often gets overlooked? The proper security measures, meaning customer data can be compromised with potential legal ramifications. This is a disaster waiting to happen, and IT will ultimately be called in to handle the cleanup, even though it’s a mess they didn’t create.

When IT is involved in securing cloud services, they ensure that sensitive and confidential information moving to the cloud is protected. By ensuring that customers route their cloud service needs through a service catalog, IT stays in control of security measures that a business user will not normally consider.

* Managing Cloud Services: Get down to Earth. You need to recognize cloud computing is here to stay and let employees know you understand and are willing to work with them, which encourages an open dialogue. Educating users about the issues cloud services can pose is also key. Start the conversation about cloud services, explain the perils inherent in the solutions, and then assure users IT is working to improve response time. That will lay the foundation for an improved relationship. From there, IT must deliver on the promise.

A service catalog provides the mechanism for IT to improve response time, as well as an equally effective way for employees to secure and track cloud service orders they place. By providing visibility into the status of orders, business users have confidence that IT has things under control.

* Profitability: Get Clear. If business users turn to an outside provider to get the resources needed to meet a tight launch window, they will often bill it to their expense account, not the IT budget. Because of this, cloud sprawl is typically buried in line items on various expense reports, resulting in hidden IT spending into which the company has no visibility.

Additionally, because no one is really “managing” this cloud resource, it may be left up and running via recurring payments (and costing the company money) long after its intended use has passed. Plus, employees tend to “over-buy,” purchasing more resources than they really need.

As a result, companies do not have a clear understanding of their total cloud expenditures. By working with business users to develop an effective strategy, IT gains visibility into purchases and can ensure the right technology is in place in a timely fashion, minimizing expenses and keeping costs visible.

Understand the Big Picture

The first step in eliminating cloud sprawl is to recognize the problem exists. The next step is to take action and get full clarity of cloud usage and spend. Once usage and costs are understood, you will have a framework for how the business is currently using IT services across the board, allowing you to create a system for managing sprawl and expenses.

Implementing an enterprise service catalog gives you visibility into cloud spend and control over expenses. An enterprise service catalog also provides a way to bill services back to the department receiving them, along with showback, chargeback and more complex costing functionality.
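
As a minimal sketch of the chargeback/showback roll-up a service catalog makes possible, the example below sums each department’s usage at an agreed rate; the departments, SKUs and prices are all hypothetical:

    # Hypothetical usage records and price book; roll cloud charges up per department.
    from collections import defaultdict

    RATE_PER_HOUR = {"small_vm": 0.05, "large_vm": 0.20}

    usage = [
        {"department": "Marketing", "sku": "large_vm", "hours": 300},
        {"department": "Marketing", "sku": "small_vm", "hours": 120},
        {"department": "Finance",   "sku": "small_vm", "hours": 80},
    ]

    charges = defaultdict(float)
    for record in usage:
        charges[record["department"]] += RATE_PER_HOUR[record["sku"]] * record["hours"]

    for department, amount in charges.items():
        print(f"{department}: ${amount:.2f}")   # Marketing: $66.00, Finance: $4.00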

Posted in Cloud Computing, Development, IM Topics

Contextual Inquiry: A Must-Have Method For Your User Research Toolbox

This session will feature two user researchers from a Boston-area company who used Contextual Inquiry to unlock the secrets of physician workflows, starting with just a single question. They’ll talk about how to prepare for a project using Contextual Inquiry, how to engage users and how to distill high-quality qualitative data into meaningful takeaways.

Title: Contextual Inquiry: A Must-Have Method For Your User Research Toolbox

Date: Wednesday, April 16, 2014

Time: 2:00 PM – 3:30 PM EDT

Space is limited. Reserve your Webinar seat now at:

http://www.asis.org/Conferences/webinars/USEWebinar-4-16-2014-register.html

After registering you will receive a confirmation e-mail containing information about joining the Webinar.

System Requirements

PC-based attendees

Required: Windows® 8, 7, Vista, XP or 2003 Server

Mac®-based attendees

Required: Mac OS® X 10.6 or newer

Mobile attendees

Required: iPhone®, iPad®, Android™ phone or Android tablet

Posted in Data Analytics, IM Topics, UX Design/User Experience, Webinar

6 Tips to Build Your Social Media Strategy

With so many social media options, how do you pick the best one(s)? IT executives and social media experts share their top six tips for selecting the social media platforms that will provide the greatest return on your investment of time and resources.

1. Identify Specifically What You Want to Accomplish Via Social Media

“To find out what the best channel is for your social media outreach, you first need to define what your business goals are–i.e., focusing on top-of-the-funnel KPIs like extending your brand recognition or bottom-of-the-funnel KPIs such as lead form submissions or ecommerce purchases,” says Lauren Fairbanks, chief content strategist, Stunt & Gimmick’s, a creative content agency.

“For example, if you want to improve your organic search rankings, then Google+ and YouTube can help,” says Martin Wong, CMO of SMARTT, a Web design and digital marketing firm. “If your intent is to provide customer support over social media, then it makes sense to do so over Facebook and Twitter,” he says.

Or maybe you want to use social media as part of your customer service and support efforts. “Social strategies aren’t only for marketing,” says Kristin Muhlner, CEO of newBrandAnalytics. So if customer service and support is one of your top goals, “before choosing your top platforms, conduct an audit of social channels that help you support your overall customer service and communications strategy in a manageable way,” she says.

2. Figure Out Where Your Customers Are

“When deciding on which social media service is best for your business, you need to determine the social networks that your customers are using,” says Shane Gamble, marketing coordinator and ecommerce advisor, Sweet Tooth, a customer loyalty and rewards service. “For example, if your target market is women aged 25 to 34, it would be wise to have an active presence on Pinterest.”

Similarly, “if your products are consumer-oriented, use Pinterest and Facebook,” says Becky Boyd, vice president, MediaFirst, a technology PR agency. “If you offer more B2B solutions, LinkedIn (and LinkedIn Groups that would interest your audience) and Twitter are best. And if you want to demonstrate something visually, use YouTube.”

How can you determine which social media sites your existing and prospective customers use? By “using existing CRM, social metrics, Web analytics and customer surveys,” says Brad Lowrey, digital manager at public relations agency Weber Shandwick. “There is no sense putting resources and effort into being on a platform like Twitter or Pinterest if only five percent of your customers are active there.”

3. Choose a Site that Provides a Good Platform for the Type of Information You Want to Share

“Pinterest and Instagram are excellent for anyone who has a lot of visual content to share: fashion, home design, etc.,” explains Alexandra Golaszewska, the owner of AlexandraGo. If you want to connect with other businesses and provide B2B content, she suggests you use LinkedIn. If you are a retail business and want to reach customers via specials and promotions, consider Foursquare.

4. Look for Social Media Services That are Mobile-Friendly

“Over one billion smartphone users and counting means that mobile UX is more important than ever,” says Muhlner. “Put your resources into social platforms with geolocation services that your users can interact with anywhere, anytime, and you’ll have a direct channel to them 24/7.”

To read the full article:

http://www.cio.com/article/732975/6_Tips_to_Build_Your_Social_Media_Strategy?page=1&taxonomyId=3119

Posted in IM Topics, Social Computing

Behind the Scenes of Really Big Data: Computing on the Whole World

For our April Meetup, we’re thrilled to have Kalev Leetaru, Yahoo! Fellow in Residence at Georgetown University, talk about data mining at a global scale. What does it take to build a system that monitors the entire world, analyzing global news media in realtime, compiling catalogs of everything happening in the world, and making that data accessible for analysis, visualization, forecasting, and operational use? What does it take to support querying of a quarter-billion-record-by-58-column database in near-realtime? How do you visualize networks with hundreds of millions of nodes, tease structure from chaotic real-world observational graphs, or explore networks in the multi-petabyte range? How do you process and geographically visualize the emotion of the live Twitter Decahose in realtime? How do you rethink tone mining from scratch to power a flagship new reality television show? How do you adapt systems to work with machine translation, OCR and closed captioning error, and the messiness of real-world data? How do you process half a million hours of television news, five billion pages of historic books, or 60 million images dating back 500 years?

• 6:30pm — Networking, Empanadas, and Refreshments

• 7:00pm — Introduction

• 7:15pm — Presentation and Discussion

• 8:30pm — Data Drinks (tba)

Abstract:

This talk will pull back the curtain and present a behind-the-scenes view of what it’s really like to work with really big data. How does one blend the world’s most powerful supercomputers, virtual machines, cloud storage, infrastructure as a service, plus a ton of software, into a single end-to-end environment that supports all of this research? I’ll be deep-diving on the GDELT Project, a catalog of human societal-scale behavior and beliefs across all countries of the world, connecting every person, organization, location, count, theme, news source, and event across the planet into a single massive network that captures what’s happening around the world, what its context is and who’s involved, and how the world is feeling about it, every single day. What does it take to build and run a system that monitors the entire world each day and delivers a quantitative model that increasingly powers operational conflict watchboards across the world?

Bio:

Kalev H. Leetaru is the 2013-2014 Yahoo! Fellow in Residence for International Values, Communications Technology and the Global Internet at the Institute for the Study of Diplomacy in the Edmund A. Walsh School of Foreign Service  at Georgetown University. He holds three US patents (cited by a combined 44 other issued US patents) and his work has been profiled in Nature, the New  York Times, The Economist, BBC, Discovery Channel and the media of more than 100 countries. His most recent work includes the first in-depth study of the geography of social media and the changing role of distance and location in online communicative behavior around the world (named by Harvard’s Nieman Lab as the top social media study of 2013), the creation of the GDELT Project, a database of more than a quarter-billion georeferenced global events 1979-present and the people, organizations, locations, and themes connecting the world, and the creation of the SyFy Channel’s Twitter Popularity Index, the first realtime character “leaderboard” created for television. Most recently he was named as one of Foreign Policy Magazine’s Top 100 Global Thinkers of 2013. More on his latest projects can be found on his website at http://www.kalevleetaru.com/.

Sponsors:

This event is sponsored by the GWU Dept. of Decision Sciences, Cloudera, Statistics.com, IBM Analytics Solution Center, Elder Research, and InformIT. Would you like to sponsor too? Please get in touch!

Posted in Events, Information Sessions, Speakers