Thread: High Performance Computing

  1. #1

    High Performance Computing

    Hi all,
    I am going to build a high performance computing (HPC) setup,
    specifically for data mining and analytics.
    The tools I plan to use are
    Hortonworks for data collection and Vertica as the database;
    the analytics tool has not been selected yet.
    What I want to know is:
    what software should I use to build high performance computing on openSUSE?
    Is there any reference for that,
    or any manual?

    Rocks and OSCAR are really outdated, and no improvements have been made to them for a long time.

  2. #2
    Join Date
    Jun 2008
    Location
    San Diego, Ca, USA
    Posts
    11,286
    Blog Entries
    2

    Re: High Performance Computing

    You'll probably want to define exactly what you want to do, then build according to the requirements of your solution.

    Some generalities about openSUSE...
    - Unlike RHEL, openSUSE does not make available a kernel specially tuned for HPC, but whether that makes a difference for you will vary (YMMV). Just deploy the Default kernel instead of the Desktop kernel that is installed by default; this removes the workstation-oriented kernel optimizations.

    - Unlike most distros, openSUSE typically makes everything imaginable accessible from the same few repos, with the option to add more repos for specific purposes. Whatever Desktop you choose (or none at all), you can install apps for any Desktop or machine configuration.
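
    To check which kernel flavour a node is actually running, here is a minimal Python sketch. It assumes the openSUSE convention that the kernel release string ends in the flavour name (for example "-default" or "-desktop"); the helper name kernel_flavour is hypothetical, and on other distros the suffix may mean something else.

```python
import platform


def kernel_flavour(release: str) -> str:
    """Return the text after the last '-' in a kernel release string.

    Assumption: the release string ends in the flavour name, as it does
    on openSUSE. Returns an empty string if there is no '-' at all.
    """
    return release.rsplit("-", 1)[-1] if "-" in release else ""


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    print(kernel_flavour(platform.release()))
```

    Running it on each node is a quick way to confirm that the Default kernel, not the Desktop one, is the one that booted.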

    The two leading solutions for high performance data analysis, mining and search today are the Hadoop/Pig/Hive/etc application stack, which is the most popular, and the Elasticsearch/Logstash/Kibana application stack.

    Personally, I work on an Elasticsearch cluster based almost entirely on openSUSE.
    Elasticsearch is a competitive application stack to the industry-standard Hadoop/Pig/Hive stack, with many similarities. Since ES's launch about 4 years ago, these two stacks have gone head to head, leapfrogging each other by introducing new features and adopting the best from the other.
    The most important HPC feature of these two app stacks is that both are based on NoSQL application-level clusters, which means that
    - Capacity scales out: when you need more, just add another node.
    - There is no complex OS or system-level clustering. All clustering is done at the application level, i.e. node discovery, data and metadata distribution across nodes, data fault tolerance, node failure and recovery, and management in general. For those who have lived through configuring clusters, heartbeats and more, this is a dream come true.
    - Both need Java re-tuned for high performance, but that is only necessary if you are pushing the performance envelope.
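
    To give a flavor of what "clustering at the application level" in the list above means, here is a toy Python sketch of data distribution and node failure handling. This is not Elasticsearch's or Hadoop's actual code: the CRC32 hash, the round-robin placement, the function names and the node names are all assumptions chosen for illustration.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard by hashing the document id (CRC32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def place_shards(nodes, num_shards, replicas=1):
    """Give every shard a primary and `replicas` copies, each on a different node."""
    if replicas >= len(nodes):
        raise ValueError("need more nodes than replicas")
    layout = {}
    for shard in range(num_shards):
        # Round-robin: consecutive nodes hold the copies of one shard.
        owners = [nodes[(shard + k) % len(nodes)] for k in range(replicas + 1)]
        layout[shard] = {"primary": owners[0], "replicas": owners[1:]}
    return layout


def fail_node(layout, failed):
    """Return a new layout without `failed`, promoting a replica where needed."""
    new_layout = {}
    for shard, copies in layout.items():
        owners = [n for n in [copies["primary"], *copies["replicas"]] if n != failed]
        if not owners:
            raise RuntimeError(f"shard {shard} lost all copies")
        new_layout[shard] = {"primary": owners[0], "replicas": owners[1:]}
    return new_layout


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    layout = place_shards(nodes, num_shards=5, replicas=1)
    print(shard_for("doc-42", 5), layout)
    print(fail_node(layout, "node-a"))
```

    The point of the sketch is that placement and recovery are plain application logic working on a list of nodes, with no OS-level cluster manager or heartbeat daemon involved. Adding capacity amounts to adding a name to the node list and re-running placement.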

    The reasons I use Elasticsearch instead of the traditional Hadoop et al. are
    - No need to learn half a dozen languages, one for each app in the stack. JSON is the standard used for data storage, communication between nodes and configurations.
    - Logstash is a very cool aggregator, parser, router and data transformer.
    - Kibana is a web frontend, although I've been looking at other web frontends including the ever-popular Graphite.
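
    As a rough illustration of the parse, transform and emit-as-JSON flow described in the list above, here is a small Python sketch. It is not Logstash's configuration language or API; the log line format, the field names and the function names are hypothetical.

```python
import json
import re

# Hypothetical log format: "<timestamp> <host> <LEVEL> <free-text message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<level>[A-Z]+) (?P<message>.*)$"
)


def parse_line(line):
    """Split one raw log line into named fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def transform(event):
    """Normalise the level and tag error events (a stand-in for a filter step)."""
    out = dict(event)
    out["level"] = out["level"].lower()
    out["tags"] = ["error"] if out["level"] in ("error", "fatal") else []
    return out


def to_json_docs(lines):
    """Parse and transform every matching line into one JSON document each."""
    docs = []
    for line in lines:
        event = parse_line(line)
        if event is not None:  # silently skip lines that do not match
            docs.append(json.dumps(transform(event), sort_keys=True))
    return docs


if __name__ == "__main__":
    sample = [
        "2015-06-01T12:00:00Z web01 ERROR disk full",
        "2015-06-01T12:00:05Z web02 INFO request served",
    ]
    for doc in to_json_docs(sample):
        print(doc)
```

    Because every stage hands over a JSON document, the same record can be stored, shipped between nodes and queried without translating between formats.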

    Unfortunately I haven't kept my ES on openSUSE writings up to date; everything is very old and may not work. If you want to read what I've posted, it may still give a flavor of the major parts of the stack, how to invoke them (those parts are still mostly current) and installation (today I recommend only the repo method, or you can simply read the most current documentation at https://www.elastic.co/).
    https://en.opensuse.org/User:Tsu2#Logging_and_Big_Data

    HTH,
    TSU

  3. #3

    Re: High Performance Computing

    For anyone who wants to install Elasticsearch, I have updated my wiki page describing how to install using the elastic repos.
    https://en.opensuse.org/User:Tsu2/el...official_repos

    The pages I've written about ES may be considered far superior to what you'll find about openSUSE on the elastic.co website.
    Still, there is a tremendous amount of information on the elastic.co website not covered by my writings. If you are unable to make anything work, just post your question.

    IMO,
    TSU
