Wednesday, November 5, 2014

BIG-DATA-TECH BLOGS

It has been a long time since I last blogged, and I wanted to catch up on things.
Below are a few technical blogs that I follow daily to improve my big data knowledge in the field I work in:


http://techblog.netflix.com/
https://blog.twitter.com/engineering
https://developers.facebook.com/


You are always welcome to comment if you feel I missed a few amazing blogs. I will surely add them to the list.

Monday, October 15, 2012

Setting up HBase on Windows x64 VM



Yes, there is an “official” guide to HBase installation for Windows, but it seems to be written for older versions of HBase. Some steps are not necessary anymore, but on the other hand, there are some steps that weren’t mentioned, but are crucial (like the ZooKeeper stuff).
This tutorial will guide you through the HBase installation which is based on the Cygwin in a way that is similar to the official guide. I have tested this on Windows 7, 64bit.

Downloading Cygwin

  1. download cygwin setup.exe and run it
  2. choose an appropriate mirror
     I will assume that Cygwin will be installed into C:\Programs\Cygwin. Do not install Cygwin into a folder whose path contains a space character (such as C:\Program Files). If you do, you will face many random and unexpected problems.
  3. from packages, choose the following:
    • OpenSSH,
    • tcp_wrappers,
    • diffutils [this should be pre-selected],
    • zlib
  4. proceed with installation until it is finished.

Configuring Cygwin

  1. run the Cygwin Bash Shell with Administrator privileges (Cygwin.bat in the Cygwin installation folder, e.g. C:\Programs\Cygwin\Cygwin.bat)
  2. from this Bash shell run ssh-host-config
    • say “yes” to privilege separation
    • say “yes” to create the sshd account
    • say “yes” to install sshd as a service
    • press Enter to leave the CYGWIN value for the daemon empty
    • Now Cygwin needs to create a new account that will be used as a “proxy”/setuid origin account. When asked whether to use a different name, say “no” to keep the default name (cyg_server).
    • say “yes” to create a new privileged account cyg_server.
    • create a password for this new privileged account and confirm it
  3. synchronize Windows user accounts with Cygwin user accounts:
    mkpasswd -cl > /etc/passwd
    mkgroup --local > /etc/group
    
  4. start SSH server with net start sshd
  5. test connection with ssh localhost from Cygwin Bash Shell.
    • say “yes” to check and store server fingerprint
    • put your Windows account password to authenticate
    • issue a few test commands in the remote session
    • close session with exit.
  6. alternatively: test your SSHD with putty.

Configuring HBase

  1. I assume that you have the Java JDK installed (if not, it’s time to do that now). I also assume that Java is installed into a folder without spaces in its path (again, no C:\Program Files\Java). If you have a previous Java installation in a path that contains spaces, reinstall it now.
  2. Download HBase from Apache Site. Unpack it into an appropriate folder. I assume this should be C:\java\hbase.
  3. Open ./conf/hbase-env.sh in HBase directory
    • uncomment and modify this line so it reads:
      export JAVA_HOME=/cygdrive/c/java/jdk7
      
    • uncomment and modify this line so it reads:
      export HBASE_CLASSPATH=/cygdrive/c/java/hbase/lib/zookeeper-3.4.3.jar 
      
  4. Copy ./src/main/resources/hbase-default.xml to ./conf
  5. Open ./conf/hbase-default.xml in HBase directory
    • Change hbase.rootdir to /tmp
       This will resolve into C:\tmp on Windows. We will create it later.
    • Change hbase.tmp.dir to C:/programs/cygwin/root/tmp/hbase/tmp
       This also assumes that Cygwin is installed into C:\programs\cygwin.
    • If you have a computer that has no domain name, then determine your hostname: either by running hostname from shell or from System Properties | Computer Name tab. For example, my PC has hostname rn-PC.
    • Change hbase.zookeeper.quorum to rn-PC instead of localhost
       Windows 64-bit seems to have trouble resolving localhost to 127.0.0.1.
    • Change hbase.defaults.for.version.skip to true instead of false
       This will disable weird version warnings. We are actually running HBase from an “uncompiled” source tree, so some config files remain unprocessed. Although HBase is built with Maven, the build depends heavily on Linux tools and requires lots of hacking on Windows. Fortunately, building is not necessary.
  6. Create the appropriate directories. Execute this from Cygwin Bash Shell:
    mkdir -pv /root/tmp/hbase/data
    mkdir -pv /cygdrive/c/tmp
    
  7. Grant the appropriate rights
    chmod 777 /root/tmp/hbase/data
    chmod 777 /cygdrive/c/tmp
    

Running HBase

  1. Within Bash, change dir to
    cd /cygdrive/c/java/hbase
    
  2. Run
    ./bin/start-hbase.sh
    
  3. Enter password twice and HBase should start. On the first run, you may be prompted for the SSH fingerprint mismatch — in that case, just confirm with “yes”. Ideally, the console should show:
    $ ./bin/start-hbase.sh
    rn@127.0.0.1's password:
    127.0.0.1: starting zookeeper, logging to /cygdrive/c/java/hbase/bin/../logs/hbase-rn-zookeeper-rn-PC.out
    starting master, logging to /cygdrive/c/java/hbase/bin/../logs/hbase-rn-master-rn-PC.out
    rn@localhost's password:
    localhost: starting regionserver, logging to /cygdrive/c/java/hbase/bin/../logs/hbase-rn-regionserver-rn-PC.out
    
  4. In case of failure, check the log files (see C:\java\hbase\logs).
  5. HBase can be stopped with
    ./bin/stop-hbase.sh
    
    Note that you should wait for the stopping of the server (it may take a long time), otherwise you risk data corruption.

Using HBase

  1. Start the HBase server.
    ./bin/start-hbase.sh
  2. Start Bash and start the HBase Shell:
    ./bin/hbase shell
    
  3. Create a simple table:
    create 'test', 'data'
    
  4. Verify that the table has been created
    list
    
  5. Insert some data:
    put 'test', 'row1', 'data:1', 'value1'
    
  6. List all rows in the table
    scan 'test'
    
  7. Optionally, drop table
    disable 'test'
    drop 'test'
    
  8. You can leave the HBase shell with exit.

Tuesday, October 2, 2012

Big Data Had Ooop!!!

If you are a beginner and plan to learn more about big data and Hadoop, below are a few tutorials that I recommend and found interesting:

Books :
Hadoop: The Definitive Guide

Links:
++ Beginner Hadoop ++ 
http://www.cloudera.com/protected/?resource=introduction-to-apache-mapreduce-and-hdfs

http://nosqltapes.com/video/understanding-mapreduce-with-mike-miller



++ MapReduce Framework ++ 
Great 1 hour video introduction: http://nosqltapes.com/video/understanding-mapreduce-with-mike-miller

Read the famous 2004 paper from Google that kicked off the MapReduce revolution. This is a very readable paper that can be digested in about 2 - 3 hours: http://research.google.com/archive/mapreduce.html

Here's a 33 minute video on what kinds of simple things you can do with MapReduce: 
http://www.cloudera.com/videos/mapreduce_algorithms

Google's MapReduce course: 
http://code.google.com/edu/parallel/mapreduce-tutorial.html


These are a few that I have read. I will keep adding links here as and when I come across a good article that catches my eye.

Thursday, April 12, 2012

MACRUBY--- It's just ruby mac.. :)

Recently, when I saw "MACRUBY" on Hacker Shelf, I couldn't resist picking up the book. Have a look at it via the link below:
http://hackershelf.com/book/69/macruby/
  
When I started going through this book, I felt I was reading the details of the Nu language one more time... but it's different, trust me. Don't lose me here. I love to code in Ruby, and I have used RubyCocoa in the past. MacRuby is better; in other words, you get the best of Ruby 1.9 plus Objective-C. It is the best fit for native OS problems. Learning MacRuby does not require you to remember Objective-C (the last time I saw Objective-C code was in 2007, so my reaction was: Objective-C... I don't think I remember anything).

One of the best things about MacRuby is that it uses YARV (Yet Another Ruby VM), which is a bytecode interpreter. Performance on this VM is much better than on Matz's Ruby interpreter; the time difference is roughly 16 sec versus 5 sec.

MacRuby's thread implementation goes well beyond that of standard Ruby: its threads are native threads, and calling back into Ruby from pthreads, which was impossible in standard Ruby, can be done pretty easily in MacRuby.

So where does Objective-C pop in? :) When we look at the performance of Ruby's garbage collector, it is very slow, and MacRuby turns to Objective-C for the rescue. The new Objective-C garbage collector engine, due to its generational nature, performs fast collections. It also doesn't stop the world while collecting memory, because collections are done in a separate thread.

In MacRuby, all Ruby classes and objects are actually Objective-C classes and objects. There is no need to create costly proxies, convert objects, and cache instances. A Ruby object can be cast (toll-free) at the C level as an Objective-C object. The Ruby VM can also handle incoming Objective-C objects without conversion.

In MacRuby, the primitive Ruby classes (e.g., String, Array, and Hash) have been re-implemented on top of their Cocoa equivalents (respectively, NSString, NSArray, and NSDictionary). As an example, all strings in MacRuby are Cocoa strings, so they can be passed directly to underlying C or Objective-C APIs. It is also possible to call any method of the String interface on any Cocoa string, subclass Objective-C methods, etc.

I still need to test it with audio codecs and image processing implementations.
I will shortly post my Twitter video aggregation, built with MacRuby 0.10, to my Git account; keep following it.
MacRuby is nothing but Ruby 1.9 with the Mac OS X frameworks on the Mac, and it ROCKS!!!!

Monday, September 5, 2011

RUBY ON RAILS Interview Questions

On the occasion of Teachers' Day today, I wanted to post a few interesting questions and answers!!!! These are some of the mandatory and interesting questions that get clubbed together in an interview. Here I go... :). This is not just for jobs; it is for your own interest too.
1. Why Ruby on Rails?
Ans: There are a lot of advantages to using Ruby on Rails (check the meaning of each in the book):
  • DRY principle (Don't Repeat Yourself)
  • Convention over Configuration
  • Gems and plugins
  • Scaffolding
  • Pure OOP concept
  • REST support
  • Rack support
  • Action Mailer
  • RPC support
  • REXML support, etc.

2. Explain about the programming language ruby?
Ans: Ruby is the brainchild of the Japanese programmer Yukihiro "Matz" Matsumoto, who created it. It is a cross-platform, object-oriented language. Its readable syntax helps you understand what the code in your application does, and with legacy code it is well suited to administration and organization tasks. Being open source, it has gone through a great deal of development.
Ruby is a dynamically typed programming language, and its typing style is commonly referred to as duck typing.
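
Duck typing is easier to see in code than to describe. The following is a minimal sketch; the Duck and Robot classes and the make_it_speak method are made up for illustration and do not come from any library:

```ruby
# Hypothetical classes: they share no superclass other than Object.
class Duck
  def speak
    "Quack"
  end
end

class Robot
  def speak
    "Beep"
  end
end

# Accepts any object that responds to #speak; its class is never checked.
def make_it_speak(thing)
  thing.speak
end

puts make_it_speak(Duck.new)
puts make_it_speak(Robot.new)
```

The method only cares about what the object can do, not about what the object is.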

3. Explain about ruby names?
Ans: Classes, variables, methods, constants and modules are all referred to by Ruby names. Ruby distinguishes between the various kinds of names by the first character of the name. Some names are reserved words, which should not be used for any other purpose. A name is made up of lowercase letters, uppercase letters, digits and underscores; the first character must not be a digit, and it is followed by any number of name characters.

4. What is the Difference between Symbol and String?
Ans: Symbols are similar to strings, but the two behave differently with respect to object_id, memory use and processing (CPU) time. Strings are mutable; symbols are immutable. Testing two symbol values for equality (or inequality) is also faster than testing two string values for equality.

Mutable objects can be changed after assignment, while immutable objects can only be replaced by a new object.
For example:
p "string object jak".object_id #=> 22956070
p "string object jak".object_id #=> 22956030
p "string object jak".object_id #=> 22956090

p :symbol_object_jak.object_id #=> 247378
p :symbol_object_jak.object_id #=> 247378
p :symbol_object_jak.object_id #=> 247378

p " string object jak ".to_sym.object_id #=> 247518
p " string object jak ".to_sym.object_id #=> 247518
p " string object jak ".to_sym.object_id #=> 247518

p :symbol_object_jak.to_s.object_id #=> 22704460
p :symbol_object_jak.to_s.object_id #=> 22687010
p :symbol_object_jak.to_s.object_id #=> 21141310

Note : Each unique string value has an associated symbol
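
The object_id listings above show identity; the mutability side can be sketched the same way. This is only an illustration, and the method name symbol_vs_string_demo is made up. String.new is used so that the result does not depend on how string literals are treated by the Ruby version in use:

```ruby
# A small, hypothetical demo method that collects a few observations.
def symbol_vs_string_demo
  a = String.new("jak")
  b = String.new("jak")
  s = String.new("jak")
  s << "!"                                  # strings can be changed in place

  {
    strings_equal:     a == b,              # same content
    strings_identical: a.equal?(b),         # but two separate objects
    symbols_identical: :jak.equal?(:jak),   # one shared object
    mutated_string:    s,
    symbol_frozen:     :jak.frozen?         # symbols cannot be modified
  }
end

p symbol_vs_string_demo
```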

5. What is Session and Cookies?
Ans:
Session: are used to store user information on the server side.
Cookies: are used to store information on the browser side or we can say client side
Example: Session: say session[:user] = "jyotsnac"; the value persists as long as the browser is not closed

6. Difference between render and redirect?
Ans:
render example:
    render :partial
    render :new

  It will render the template new.rhtml without calling or redirecting to the new action.

redirect example:

 redirect_to :controller => 'users', :action => 'new'

  It forces the client's browser to request the new action.
7. What is the Difference between Static and Dynamic Scaffolding?
Ans:
The Syntax of Static Scaffold is like this:
ruby script/generate scaffold User Comment
Here Comment is the model and User is your controller. So, all in all, static scaffolding takes two parameters, i.e. your controller name and your model name, whereas in dynamic scaffolding you have to define the controller and the model one by one.
8. How do you run your Rails application without creating a database?
Ans:
You can run the application by uncommenting the following line in environment.rb:

Path => rootpath/config/environment.rb
# Skip frameworks you're not going to use (only works if using vendor/rails)

    config.frameworks -= [ :action_web_service, :action_mailer,:active_record ]

9. How do you use an SQL database (e.g. MySQL) without defining it in database.yml?
Ans:
You can use ActiveRecord anywhere!

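require 'rubygems'
require 'active_record'

# Connection settings are passed directly instead of being read from database.yml
ActiveRecord::Base.establish_connection({
  :adapter  => 'postgresql',
  :username => 'foo',
  :password => 'bar',
  :database => 'whatever'
})

class Task < ActiveRecord::Base
  # Map the model onto an existing table with a non-conventional name
  set_table_name "a_legacy_thingie"

  def utility_methods
    update_attribute(:title, "yep")
  end
end

Task.find(:first)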

It’s ActiveRecord, you know what to do. Going wild:
# In-memory SQLite database (newer Active Record releases expect :database instead of :dbfile)
ActiveRecord::Base.establish_connection(:adapter => "sqlite3", :dbfile => ":memory:")

ActiveRecord::Schema.define(:version => 1) do
  create_table :posts do |t|
    t.string :title
    t.text :excerpt, :body
  end
end

class Post < ActiveRecord::Base
  validates_presence_of :title
end

Post.create(:title => "A new post!")
Post.create(:title => "Another post",
            :excerpt => "The excerpt is an excerpt.")
puts Post.count

10. What are helpers and how to use helpers in ROR?
Ans:
Helpers (“view helpers”) are modules that provide methods which are automatically usable in your view. They provide shortcuts to commonly used display code and a way for you to keep the programming out of your views. The purpose of a helper is to simplify the view. It’s best if the view file (RHTML/RXML) is short and sweet, so you can see the structure of the output.

11. What is Active Record?
Ans: Active Record is an Object Relational Mapping (ORM) layer, where classes are mapped to tables, objects are mapped to rows, and object attributes are mapped to the columns of the table.

12. Ruby Support Single Inheritance/Multiple Inheritance or Both?
Ans:
Ruby supports only single inheritance.
You can achieve the effect of multiple inheritance through the mixin concept, that is, by including modules in classes. A good blog post for understanding mixins: http://juixe.com/techknow/index.php/2006/06/15/mixins-in-ruby/
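
A minimal sketch of the mixin idea follows. All names here (Walkable, Swimmable, Animal, Frog) are made up for illustration:

```ruby
# Two hypothetical modules, each contributing one behaviour.
module Walkable
  def walk
    "#{name} walks"
  end
end

module Swimmable
  def swim
    "#{name} swims"
  end
end

class Animal
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Frog has exactly one superclass but mixes in two modules.
class Frog < Animal
  include Walkable
  include Swimmable
end

frog = Frog.new("Freddy")
puts frog.walk
puts frog.swim
```

Frog still inherits from a single class, yet it picks up methods from both modules.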

13. What is the naming convention for methods that return a boolean result?
Ans:
Methods that return a boolean result are typically named with an ending question mark. For example:
def active?
  return true # just always returning true
end


15. How do the following methods differ: @my_string.strip and @my_string.strip! ?
Ans:
The strip! method modifies the string in place. Calling strip (without the !) returns a copy of the string with the modifications; the original string is not altered.
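
A small sketch of the difference; the method name strip_demo is made up, and String.new is used so the example does not depend on how the Ruby version treats string literals:

```ruby
# Hypothetical demo method comparing strip and strip!.
def strip_demo
  original = String.new("  hello  ")
  copy = original.strip                # new string, original untouched

  in_place = String.new("  hello  ")
  in_place.strip!                      # receiver itself is modified

  # strip! returns nil when there was nothing to remove
  nothing_to_do = String.new("hello").strip!

  { original: original, copy: copy, in_place: in_place, nothing_to_do: nothing_to_do }
end

p strip_demo
```

Because strip! can return nil, it is safer not to chain further calls onto its result.
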
16. What's the difference in scope for these two variables: @name and @@name?
Ans:
@name is an instance variable, so each object has its own copy; @@name is a class variable, i.e. a single variable shared by all the instances of a class.
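
A minimal sketch of the two scopes; the Person class and its count method are made up for illustration:

```ruby
# Hypothetical class: @name is per object, @@count is shared by the class.
class Person
  @@count = 0

  attr_reader :name

  def initialize(name)
    @name = name      # instance variable: one per object
    @@count += 1      # class variable: one for all Person objects
  end

  def self.count
    @@count
  end
end

alice = Person.new("Alice")
bob   = Person.new("Bob")
puts alice.name
puts bob.name
puts Person.count
```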

17. Which log has to be checked to find an error in Ruby on Rails?
Ans:
Rails will report errors from Apache in log/apache.log and errors from the Ruby code in log/development.log. If you're having a problem, do have a look at what these logs are saying. On Unix and Mac OS X you may run tail -f log/development.log in a separate terminal to monitor your application's execution.

18. What is the use of global variable $ in Ruby?
Ans:
A class variable starts with @@, immediately followed by an upper- or lowercase letter; further name characters after that letter are optional. A class variable is shared among all the objects of a class, and a single copy of it exists for each class.
A global variable starts with a $ sign, which is followed by a name character. Ruby also defines a number of global variables whose names include punctuation characters, such as $_ and $-k.

For example, a variable declared as global can be accessed anywhere, whereas an instance or class variable is visible only inside its object or class:
class Test
  def h
    $a = 5
    @b = 4

    while $a > 0
      puts $a
      $a = $a - 1
    end
  end
end

test = Test.new
test.h
puts $a   # 0 (the loop in h counted the global down from 5)
puts @b   # nil (@b belongs to the Test instance, not to the top-level object)

19. What is the default access specifier of the default constructor in Ruby?
Ans: In Ruby the constructor is the initialize method, and it is always private; objects are created through the public class method new, which calls it. The default constructor is the constructor without arguments.

20. What is the use of super in Ruby on Rails?
Ans:
Ruby uses the super keyword to call the superclass implementation of the current method
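
A minimal sketch of super; the Base and Polite classes are made up for illustration:

```ruby
# Hypothetical parent class.
class Base
  def greet(name)
    "Hello, #{name}"
  end
end

# The subclass extends the inherited behaviour instead of replacing it.
class Polite < Base
  def greet(name)
    greeting = super            # bare super passes the same arguments along
    greeting + ", nice to meet you"
  end
end

puts Polite.new.greet("Matz")
```

Calling super without parentheses forwards the current method's arguments; super() would call the parent implementation with no arguments.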


21. What is the difference between nil and false in ruby?
Ans:
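false is a boolean value (the only instance of FalseClass), whereas nil is not a boolean; it is the only instance of NilClass and stands for "no value". In Ruby 1.8 and 1.9, nil has object_id 4. Both are treated as false in a condition.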
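
A short sketch of how the two values behave; the helper name truthiness is made up for illustration:

```ruby
# Hypothetical helper: reports how a value behaves in a condition.
def truthiness(value)
  value ? :truthy : :falsy
end

puts nil.class            # NilClass
puts false.class          # FalseClass
puts truthiness(nil)      # falsy
puts truthiness(false)    # falsy
puts truthiness(0)        # truthy: only nil and false count as false
puts nil == false         # they are still different objects
```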

22. How are class methods defined in Ruby?
Ans:
def self.methodname
  --------
  --------
end
or
def ClassName.methodname
  --------
  --------
end
or

class << self
  def methodname
  end
end

23. How are object methods defined in Ruby?
Ans:

class Jak

  def method1
    --------
    --------
  end

end

obj = Jak.new

The following method is defined on a single object only:
def obj.object_method_one
  --------
  --------
end

obj.send(:object_method_one)

By contrast, method1 above is available on every object that is created from the class.

24. What are the operators available in Ruby?
Ans:
An operator is something that’s used in an expression to manipulate objects, such as + (plus), - (minus), * (multiply), and / (divide). You can also use operators to do comparisons, such as < and >, and to combine conditions, such as &&.

25. What are the looping structures available in Ruby?
Ans: for..in
until..end
while..end
loop do..end

Note: You can also use each to iterate over an array; it acts like a loop, although it is an iterator method rather than a loop construct.
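
The looping structures listed above can be sketched side by side. The method name loop_demo and the values are made up for illustration:

```ruby
# Hypothetical demo method running each looping structure once.
def loop_demo
  results = {}

  sum = 0
  for i in 1..3            # for..in walks over a range or collection
    sum += i
  end
  results[:for] = sum

  n = 3
  countdown = []
  while n > 0              # while runs as long as the condition is true
    countdown << n
    n -= 1
  end
  results[:while] = countdown

  m = 0
  until m == 3             # until runs as long as the condition is false
    m += 1
  end
  results[:until] = m

  k = 0
  loop do                  # loop runs forever until break
    k += 1
    break if k == 2
  end
  results[:loop] = k

  results
end

p loop_demo
```
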
26. What are the object-oriented programming features supported by Ruby?

Ans:
Classes, objects, inheritance, singleton methods and polymorphism (accomplished by overriding; Ruby has no signature-based method overloading) are some of the OO concepts supported by Ruby.

27. What is the scope of a local variable in Ruby?
Ans:
A new scope for a local variable is introduced at the top level, in a class (or module) definition, and in a method definition. A block also introduces a new scope, but from inside the block you can still access local variables defined outside it.
The scope in a block is special because a local variable should be localized in Thread and Proc objects.
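
These scope rules can be checked with a small sketch. The method names method_sees_outer? and scope_demo are made up for illustration:

```ruby
# A method body starts a fresh scope: the caller's locals are not visible here.
def method_sees_outer?
  (defined?(outer)) ? true : false
end

# Hypothetical demo method collecting the observations.
def scope_demo
  outer = 10
  total = 0

  [1, 2, 3].each do |n|
    inner = n * 2        # inner exists only inside the block
    total += inner       # total is a local from the enclosing scope
  end

  {
    outer: outer,
    total: total,
    inner_visible_after_block: (defined?(inner)) ? true : false,
    method_sees_outer: method_sees_outer?
  }
end

p scope_demo
```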

28. How is an iterator handled in Ruby?

Ans: An iterator is handled using the each method in Ruby.
For example, given
number = [1, 2, 3]
we can use the iterator as
number.each do |i|
  puts i
end
The above prints the values of the array number, which is accomplished using the iterator.
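
Your own classes can offer each as well. The following is a minimal sketch, assuming a made-up Countdown class; including Enumerable is an optional extra that builds methods such as map and to_a on top of each:

```ruby
# Hypothetical class that yields the numbers from a start value down to 1.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  def each
    return enum_for(:each) unless block_given?
    @from.downto(1) { |i| yield i }
  end
end

Countdown.new(3).each { |i| puts i }
p Countdown.new(3).map { |i| i * 2 }
```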

29. How is visibility of methods changed in Ruby?
Ans: By applying an access modifier: public, private or protected.
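
A minimal sketch of the three modifiers; the Account class and its methods are made up for illustration:

```ruby
# Hypothetical class showing public, protected and private methods.
class Account
  def initialize(balance)
    @balance = balance
  end

  # public (the default): callable from anywhere
  def richer_than?(other)
    balance > other.balance     # protected methods may be called on another Account
  end

  def report
    "Balance: #{format_amount}" # private methods are called without a receiver
  end

  protected

  def balance
    @balance
  end

  private

  def format_amount
    "%.2f" % @balance
  end
end

a = Account.new(10)
b = Account.new(5)
puts a.richer_than?(b)
puts a.report
```

Calling a.balance or a.format_amount from outside the class raises NoMethodError.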


30. What is the use of load and require in Ruby?
Ans: require is a method that loads and processes the Ruby code from a separate file, bringing whatever classes, modules, methods, and constants are in that file into the current scope. It does this only once per file. load is similar, but rather than performing the inclusion operation once, it reprocesses the code every time load is called.
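
The difference can be observed with a throwaway file. This is only a sketch: the method name load_vs_require_demo, the file name counter_demo.rb and the global $demo_load_count are made up for illustration:

```ruby
require "tmpdir"

# Hypothetical demo: writes a tiny Ruby file, then requires and loads it.
def load_vs_require_demo
  $demo_load_count = 0
  Dir.mktmpdir do |dir|
    path = File.join(dir, "counter_demo.rb")
    # Each time the file is executed, the counter goes up by one.
    File.write(path, "$demo_load_count += 1\n")

    first  = require(path)   # executes the file
    second = require(path)   # already required: skipped
    load(path)               # executes the file again
    load(path)               # and again

    { first_require: first, second_require: second, runs: $demo_load_count }
  end
end

p load_vs_require_demo
```

The file runs once for the two require calls and once for each load call.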

31. Explain about class libraries in ruby?
Ans: Ruby has a strong set of class libraries covering a variety of domains, such as thread programming and data types. Ruby is also a relatively new language, and additional libraries for it appear every day. Many older languages have huge libraries simply because of their age.

32. Explain about portability?
Ans: The Ruby language has been ported to many platforms, and Ruby programs can be moved between those platforms without any modification to the source code. This feature has made the language very useful and widely used by programmers worldwide. Some of the supported platforms are DOS, UNIX, Windows, etc.