Data World

Exam 70-461 Notes – Set Theory

Posted by Pramod Singla on March 23, 2017


  1. An ordered list of rows returned by a query (i.e. a result with an ORDER BY) is called a cursor, as opposed to an unordered relational set.

  2. Without an ORDER BY clause, a query does not return results in any particular order, even if a clustered index exists on the table.

  3. Standard SQL requires both a SELECT and a FROM clause, whereas T-SQL can run a query with only SELECT.

  4. "dbo"."order" (double-quoted identifiers) is the standard SQL format, whereas [dbo].[order] (square brackets) is the Microsoft T-SQL format.

  5. T-SQL is based on SQL, which in turn is based on the relational model, which is itself based on set theory and predicate logic.

         T-SQL <- SQL <- relational model <- set theory and predicate logic.
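Points 3 and 4 can be sketched as follows (the dbo.order table is hypothetical; "order" must be delimited because it is a reserved word):

```sql
-- Valid in T-SQL, but not in standard SQL (no FROM clause):
SELECT 1 AS n;

-- Standard SQL delimited identifiers (double quotes):
SELECT * FROM "dbo"."order";

-- Microsoft T-SQL delimited identifiers (square brackets):
SELECT * FROM [dbo].[order];
```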


Posted in Certifications, MCSA 70-461

Exam 70-461 Notes – Grouping Sets

Posted by Pramod Singla on March 23, 2017


  1. A data analysis function can be either a group (aggregate) function or a window function.

  2. T-SQL supports three clauses that let you define multiple grouping sets. These operators can generate the same result set as using UNION ALL to combine multiple single-grouping queries:

    • ROLLUP

    • CUBE

    • GROUPING SETS

  3. ROLLUP is used where the input columns form a hierarchy, e.g. Country, Region, City.

  4. GROUPING vs. GROUPING_ID:

    • GROUPING accepts one input column and returns 0 if that element is part of the current grouping set, else 1.

    • GROUPING_ID accepts a list of columns as input and returns an integer representing a bitmap of the GROUPING values.

    • GROUPING is used to distinguish the NULLs that are returned by ROLLUP, CUBE, or GROUPING SETS from standard NULL values in the data.

    https://msdn.microsoft.com/en-us/library/ms178544.aspx

    https://msdn.microsoft.com/en-us/library/bb510624.aspx
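A minimal sketch tying points 2-4 together (the Sales table and its columns are hypothetical):

```sql
-- Equivalent to a UNION ALL of three single-grouping queries:
SELECT Country,
       Region,
       SUM(Amount)                  AS TotalAmount,
       GROUPING(Country)            AS IsCountryAggregated,  -- 0 = in grouping set, 1 = aggregated away
       GROUPING_ID(Country, Region) AS GroupingBitmap        -- integer bitmap over both columns
FROM   Sales
GROUP  BY GROUPING SETS ((Country, Region), (Country), ());
```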

Posted in Certifications, Development, MCSA 70-461

SQL Server Tuning – Top 10 high memory consuming Queries

Posted by Pramod Singla on March 23, 2017


Problem Statement: Script to find the top 10 memory-consuming queries, ranked by total logical reads (a proxy for buffer-pool pressure).

Solution:

SELECT TOP 10
       Object_name(qt.objectid)                    AS 'SP Name',
       Substring(qt.text, ( qs.statement_start_offset / 2 ) + 1,
                 ( ( CASE qs.statement_end_offset
                       WHEN -1 THEN Datalength(qt.text)
                       ELSE qs.statement_end_offset
                     END - qs.statement_start_offset ) / 2 ) + 1)
                                                   AS statement_text,
       qs.total_logical_reads,
       qs.execution_count                          AS 'Execution Count',
       qs.total_logical_reads / qs.execution_count AS 'AvgLogicalReads',
       -- NULLIF avoids a divide-by-zero error for plans cached less than a minute ago
       qs.execution_count /
         Nullif(Datediff(minute, qs.creation_time, Getdate()), 0)
                                                   AS 'Calls/minute',
       qs.total_worker_time / qs.execution_count   AS 'AvgWorkerTime',
       qs.total_worker_time                        AS 'TotalWorkerTime',
       qs.total_elapsed_time / qs.execution_count  AS 'AvgElapsedTime',
       qs.total_logical_writes,
       qs.max_logical_reads,
       qs.max_logical_writes,
       qs.total_physical_reads,
       qt.dbid,
       qp.query_plan
FROM   sys.dm_exec_query_stats AS qs
       CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
       OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE  qt.dbid = Db_id() -- Filter by current database
ORDER  BY qs.total_logical_reads DESC;

Posted in DBA, Interview Questions, Performance Tuning, Scripts

Exam 70-775: Big Data Engineering with HDInsight

Posted by Pramod Singla on March 22, 2017


1) Administer and Provision HDInsight Clusters

2) Implement Big Data Batch Processing Solutions

3) Implement Big Data Interactive Processing Solutions

  • Implement interactive queries for big data with Spark SQL

  • Perform exploratory data analysis by using Spark SQL

  • Implement interactive queries for big data with Interactive Hive

        • Enable Hive LLAP through Hive settings
        • Manage and configure memory allocation for Hive LLAP jobs

          Through Ambari->Hive->Configs->Interactive Query

        • Connect BI tools to Interactive Hive clusters
        • Perform interactive querying and visualization
        • Use Ambari Views
        • Use HiveQL
        • Parse CSV files with Hive
           CREATE TABLE TAB_NAME (COL1 COL_TYPE1, COL2 COL_TYPE2)
           ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
           LOAD DATA INPATH 'wasbs://yourcsvfile.csv' INTO TABLE TAB_NAME;
          
        • Use ORC versus Text for caching
           CREATE TABLE IF NOT EXISTS TAB_NAME (COL1 COL_TYPE1, COL2 COL_TYPE2)
           STORED AS ORC;
          
        • Use internal and external tables in Hive
        • Use Zeppelin to visualize data
  • Perform exploratory data analysis by using Hive

    • Perform interactive querying and visualization, use Ambari Views, use HiveQL, parse CSV files with Hive, use ORC versus Text for caching, use internal and external tables in Hive, use Zeppelin to visualize data

    • Useful Links
  • Perform interactive processing by using Apache Phoenix on HBase

    • Use Phoenix in HDInsight; use Phoenix Grammar for queries; configure transactions, user-defined functions, and secondary indexes; identify and optimize Phoenix performance; select between Hive, Spark, and Phoenix on HBase for interactive processing; identify when to share metastore between a Hive cluster and a Spark cluster

      1. Useful Links
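The internal-vs-external table objective above can be sketched in HiveQL as follows (the table names and storage path are hypothetical):

```sql
-- Managed (internal) table: Hive owns the data, so DROP TABLE deletes the files.
CREATE TABLE managed_sales (id INT, amount DOUBLE)
STORED AS ORC;

-- External table: Hive tracks only the metadata, so DROP TABLE leaves the files in place.
CREATE EXTERNAL TABLE raw_sales (id INT, amount DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'wasbs://container@account.blob.core.windows.net/sales/';
```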

4) Implement Big Data Real-Time Processing Solutions

Useful links:

Posted in Big Data, Certifications, Hadoop, Spark

SQL Server – Database Consistency Check DBCC CheckDB

Posted by Pramod Singla on March 13, 2017


Problem: How to check the consistency of a SQL Server database.

Solution:

DBCC CHECKDB WITH NO_INFOMSGS, ALL_ERRORMSGS;

  • This command can put significant overhead on the database, so it is normally scheduled to run at off-peak hours.

  • To see the database consistency history report: right-click the database > Reports > Standard Reports > Database Consistency History.
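For large, busy databases, a lighter-weight variant is sometimes used (a sketch: PHYSICAL_ONLY skips the logical checks, so a full CHECKDB should still run periodically; the database name is hypothetical):

```sql
DBCC CHECKDB (N'YourDatabase') WITH PHYSICAL_ONLY, NO_INFOMSGS;
```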

Figure 1: Database Consistency History

 

Posted in DBA