In order to work with real examples that you can follow along, first we are going to create two temporary tables named “people” and “salary” and then these two temporary tables will be used for the rest of the exercise. The benefit of this approach is that you can copy and paste the table creation part and then practice along the following steps to maximize the learning opportunity. Please note, you do not need to understand the table creation part of the work at this point. The part you need to understand starts after that.
So, just go ahead, copy and paste the below into your favorite SQL tool and hit run.
1. Basics
Now let’s take a look at what we created in the two temporary tables above. I am going to frame these as questions and answers so that you can also take a stab at solving each question first, before looking at the answer that I have provided. What I have provided is just one way to solve these questions — usually there is more than one right way to get to the same answer.
Q1: Grab all the data available in table people.
A1:
In general, I will first provide the answers and then will explain what happened and how different SQL concepts work, either under the same question or the one following. For example, Q2 will explain what happened in Q1.
SELECT * FROM people;
Results:
And for future reference, this is what table salary looks like, using the following:
SELECT * FROM salary;
Pro Tip: At this point, I suggest you take a snapshot of these two tables — I’ve added instructions below on how to take a snapshot. You will want to refer back to the tables to write queries going forward and scrolling up and down in this post to find these two tables won’t be fun. For convenience, I have included instructions on how to take a snapshot on Mac and Windows:
Mac Users — Press and hold these three keys together: Shift, Command, and 4 and then select the area for the snapshot. If you do not wish to manually select the snapshot area, Pressing Shift, Command and 3 together will do a print screen.
Windows Users — Press these three keys together: Windows Key + Shift + S and then create a snapshot using the Snip and Sketch tool (I am on a Mac so this is based on a Google search — hope it works).
Q2: Explain what happened in the previous query.
A2: SELECT statement is used to select data from a given database (i.e. table). FROM is used to indicate which database the data should be selected from (“people” in this example). And finally * simply says select everything from that database.
Q3: Show me only the data from the following columns: ['name', 'gender', 'job_title'].
A3:
SELECT name, gender, job_title FROM people;
Results:
This one was straightforward — we can indicate explicitly was columns are to be selected.
Q4: Which individuals work in New York?
A4:
SELECT name FROM people WHERE location = 'new_york';
Results:
As demonstrated in this example, we can filter the results using WHERE.
3. Aggregations
Aggregations help us do some basic analytics on the data. Examples include COUNT(), MAX(), MIN(), AVG(), and SUM().
Q5: How many individuals live either in Seattle or Phonenix?
A5:
SELECT count(person_id) FROM people WHERE location IN ('seattle', 'phoenix');
Results:
Q6: How many individuals live in each of the cities?
A6:
SELECT location, count(person_id) FROM people GROUP BY location;
Results:
We already knew what COUNT() this from the previous question but now there is also a new concept at the end of the query. GROUP BY statement groups rows of data with the same values together and is used with aggregation functions, such as COUNT(), MAX(), MIN(), AVG(), and SUM().
Q7: What is the total salaries in Phoenix and New York?
A7:
SELECT location, SUM(salary) AS total_salary FROM people WHERE location = 'phoenix' OR location = 'new_york' GROUP BY location;
Results:
There are two new learnings in this example. First one is that in this example, we assigned an alias to the SUM() column and the resulting column is now named total_salary. Had we not done so, the column would have looked as below:
Second learning is an alternative approach. Note that in the WHERE, we are using OR, which determines location can either be phoenix or new_york. Alternatively, we could have used location IN ('phoenix', 'new_york'), similar to the approach we used in Q5. This does not change the results and is simply an alternative approach.
4. Joins
This is one of the most useful features. It. helps to visualize the data when trying to understand the queries. Let’s look at an example first and then I will explain what is happening.
Q8: Show me the individual names, their locations and salaries, along with the average salary in each location.
A8: Note this is a new type of question that we had not tackled before. Individual names, locations and their salaries are available in table people while average salary by location is available in table salary. What we need is to JOIN these two tables together, which is possible because both tables have the name of the cities in common (although the columns are named location in people and city in salary). How we structure this is as follows:
SELECT p.name, p.location, p.salary, s.average_salary FROM people p LEFT JOIN salary s ON p.location = s.city;
Results:
So what exactly happened? Let’s look at the join more closely:
FROM people p LEFT JOIN salary s ON p.location = s.city
The above says that look at table people and then join it to table salary. Then we need to indicate columns in each table that share the same values (imagine stitching these two tables together only where there are common values in those two particular columns). The columns with mutual values are location from people and city from salary. Lastly, we are using the concept of aliases here. For example, table people now has an alias of p, while table salary has an alias of s. When we want to indicate column location from people, we could either indicate that as people.location or p.location. The same applies to how we indicate column city from salary as s.city. Note that the structure is always as table_name.column_name. Lastly, we used a LEFT JOIN in this example. Below I will discuss what that means and what other ways are there to join tables.
As mentioned, we used a LEFT JOIN in this example. It is helpful to have the below picture in mind when thinking about how JOIN works.
Below are some of the most common JOINtypes:
JOIN or INNER JOIN: Returns the data that have matching values in both left and right tables.
LEFT JOINor LEFT OUTER JOIN:Returns the data from the left table and all the matched data from the right table.
RIGHT JOINor RIGHT OUTER JOIN: This is the flip side of the LEFT JOIN. It returns the data from the right table and all the matched data from the left table.
FULL JOINor FULL OUTER JOIN: Returns all the data from either the left or right table when there is a match between the two
5. Conditionals
Q9: Create a new column in people that breaks down the job titles into tech and non-tech. Only return names, job titles, the new category named as job_group and salaries.
A9: In order to answer this question, first we need to see what unique job titles are available in people using the following query:
SELECT DISTINCT job_title FROM people;
Note that DISTINCT is the statement that was added to SELECT to only return unique (or distinct) values in that column, as demonstrated in the results below:
Now that we know what job titles exist, we can break them down into tech and non-tech as instructed. For this exercise, use your best judgement to determine what role is tech and what is non-tech — important part is to learn how to implement this logic in the query. For example, financial_analyst is a non-tech role, while data_scientist is a tech role. The following query does just that:
SELECT name, job_title, CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group, salary FROM people;
Results:
Let’s talk about how CASE expression works by looking more closely at what we did for this question:
CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group,
In order to implement this logic, we first start with theCASE expression and then identify the conditions, using WHEN, such as:
WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech'
The above script, first looks at the column job_title, and if the value of that column for any row is either software_developer or data_scientist, it then outputs tech. The same logic applies for the non-tech category.
Then there is ELSE job_title. This clause says that if a value is encountered in the column that was not covered under the WHEN conditions above, return the value that exists in the column job_title. For example, if we had a row in column job_title with the value of chef, since chef was not among the ones that we had included in the WHEN conditions (i.e. software_developer, data_scientist, financial_analyst, and physician), then this clause would return the original value in column job_title, which was chef.
Last part of the script is END AS job_group. CASE expression finishes with an END and the AS job_group is the alias that is given to the resulting column. That is why the column is named job_group in the results table.
Pro Tip: The order of the conditional statements, starting with WHEN, is important. Once a condition is true, it will stop reading the rest of the conditions and returns the result. If none of the conditions is met, it returns the value in the ELSE clause.
Q10: Which job_group makes more money on average? Order the results from the highest to the lowest job_group.
A10: I will include two approaches to answer this question, each one will introduce new concepts. First I am going to cover an approach to show how to use FROM directly on an output of a query and then I will cover a different approach to demonstrate application of WITH clause.
Approach 1:
SELECT job_group, AVG(salary) AS average_salary FROM ( SELECT person_id, CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group, salary FROM people ) GROUP BY job_group ORDER BY average_salary DESC;
Results:
So what exactly happened? In the past we always included the name of a table after FROM, but the new concept here is that we can also include an inner query instead of the name of the table. What happens is that the inner query after the FROM is run first and then the results are used as a temporary table to select data from in the outer query.
Approach 2:
I will start by including the query used in this approach and then will explain each of the steps within the query. Let’s start with the query first:
WITH count_table AS( SELECT CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group, count(person_id) as total_count FROM people GROUP BY 1 ), total_salary_table AS( SELECT CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group, SUM(salary) as total_salary FROM people GROUP BY 1 ) SELECT ct.job_group, tst.total_salary / ct.total_count as average_salary FROM count_table ct INNER JOIN total_salary_table tst ON ct.job_group = tst.job_group ORDER BY average_salary DESC;
Now let’s look more closely at what happened.
First we will create a temporary table using the WITH clause, named count_table. This temporary table shows how many individuals are within each job_group, as shown below:
Next we will create a second temporary table named total_salary_table that shows the total salary for each job_group, as shown below:
Now that we have these two tables, we can join them together to find the average salary for each job_group, as follows:
Results here match the results from approach 1 as expected.
6. Window Functions
These functions are more challenging to visualize initially so do not be disappointed if you do not get it right away. Go through a few examples and you will start understanding them much better. We usually include these in our interviews so I do recommend that you understand and practice these.
Window functions generally perform a calculation across a set of table rows that are somehow related. In other words, they are somehow similar to aggregations with some caveats. Unlike aggregations, window functions do not cause rows to be grouped together. This will become clearer with examples.
A few examples of window functions are: COUNT(), AVG(), SUM() , ROW_NUMBER(), RANK(), DENSE_RANK(), LAG, and LEAD and they are structured as follows:
window_function(column_name) OVER ( PARTITION BY column_name ORDER BY column_name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS output_alias;
Example above shows the most common elements of how window functions are structured but not all of them are necessary. I know so far this does not make sense so let’s look at some of them in practice to better understand them.
Q11: Create a rank of the salaries overall and also by gender from highest to the lowest salary.
A11:
SELECT name, gender, salary, RANK() OVER( ORDER BY salary DESC ) AS salary_rank_overall, RANK() OVER( PARTITION BY gender ORDER BY salary DESC ) AS salary_rank_by_gender FROM people ORDER BY salary_rank_overall, salary_rank_by_gender;
Results:
As you can see in the results, using RANK() window function, we were able to rank salaries both from an overall perspective and also within each of the male or female categories. Inside the OVER(), we indicated that we want the ranking to be broken down (or “partitioned”) by gender and also to be sorted by salary from highest to the lowest by adding ORDER BY salary DESC.
Q12: Create a running sum of each salary both overall (i.e. sum at row 2 should be row 1 + row 2, sum at row 3 should be sum of row 1 + row 2 + row 3, and so forth) and also by gender, ordered by age (from older to younger). Also include a column that shows the total salary at each row.
A12:
SELECT name, gender, birth_year, salary, SUM(salary) OVER( ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And CURRENT ROW ) AS running_total_salary_overall, SUM(salary) OVER( PARTITION BY gender ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And CURRENT ROW ) AS running_total_salary_by_gender, SUM(salary) OVER( ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And UNBOUNDED FOLLOWING ) AS total_salary FROM people ORDER BY running_total_salary_overall, running_total_salary_by_gender;
Results:
Once you go through the explanations below and compare them to the results, I am sure you will also appreciate how powerful window functions are.
This one used some new statements so let’s look at them more closely. Let’s look at the following first:
SUM(salary) OVER( ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And CURRENT ROW ) AS running_total_salary_overall,
SUM(salary) is simply adding up the salaries but how the salaries should be added up is the more interesting part and is instructed in the OVER() statement. First the salaries are sorted by the year individuals were born from lowest to the highest, as stated in ORDER BY birth_year, (absence of DESC tells us it is in an ascending order). Then ROWS tells how the SUM() should be applied to these rows of data. In this case, summation is applied to all the rows before each row (UNBOUNDED PRECEDING) up to and including the row itself (CURRENT ROW).
Now what if we wanted the summation at each row to include everything before and also everything after that specific row? That can be accomplished by the following:
ROWS between UNBOUNDED PRECEDING And UNBOUNDED FOLLOWING
This structure was used to calculate the total_salary in the following part of the query and you can see in the results section:
SUM(salary) OVER( ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And UNBOUNDED FOLLOWING ) AS total_salary
Since for each row we wanted to have the summation of all the rows before and after a given row, essentially we asked to see the total summation of that column shown at each row under total_salary, which was the same value of 868,000 for every row.
8. Miscellaneous — UNION Operator, Null Handling and Date Management
So far we covered some of the more common concepts that can help you write your own queries. In this part, I am going to cover a few additional topics that can also help you in interviews.
Similar to last time, let’s first create two new tables named misc_part1 and misc_part2 and then go through the concepts. For now, just copy, paste and run the below script to create the temporary tables for this part of the exercise.
Bonus Point (Optional): Now that you are more familiar with different SQL concepts, take a look at what you copy and paste and see if you can follow the logic. You can see that we first define a table, then specify columns within the table and the data type associated with each column, followed by adding values for each column (in the format of a row of values). That’s it! Now you can create your own temporary tables and start retrieving data from them!
Now let’s see what the tables look like. I am going to use the following to look at mist_part1:
SELECT * FROM misc_part1:
Results:
and then will use the following to see misc_part2:
SELECT * FROM misc_part2;
Results:
We can think of these two tables as some marketing data about the last time clients were contacted and also how they were contacted, such as via email, phone call, etc. There are three observations just by looking at the data:
Both tables have the same column names, so maybe there is an opportunity to combine them as part of our exercise, which we will further explore under UNION.
Some of the values are missing, which was not the case in people and salary tables. For example, looking at misc_part2, cells B5 and B6 are empty. There are also some missing values in misc_part1. We will discuss null handling.
Both tables include a date format column, which we did not have in people and salary tables. We will use these values for some date manipulation.
Similar to last time, feel free to snapshot these two tables for your reference and then let’s continue with the same question and answer format to cover some new concepts.
8.1. UNION Operator
Q13: I see that both tables include the same columns. Can you combine them into one table? Sort the results by name and also identify which row belongs to which table.
A13: Remember that we started with two tables, each of them include 8 rows of data (excluding the headers) so we expect the combined table to include 16 rows of data (excluding the headers).
This can be done using UNION ALL in the following format:
SELECT *, 1 AS misc_table_number FROM misc_part1 UNION ALL SELECT *, 2 AS misc_table_number FROM misc_part2 ORDER BY name;
Results:
Results include 16 rows of data (excluding the headers) as we expected. Let’s talk about what happened.
UNION ALL operator puts the data from each of the queries and stacks them on top of each other. There are a few conditions that need to be met for this to work properly:
Every SELECT within the UNION ALL operator must have the same number of columns. For example, in our exercise each table includes 4 columns.
Corresponding columns from each SELECT statement must have the same data type. For example, name in both of tables are in VARCHAR(30)data type, or last_contacted in both tables are in DATE data type.
Columns of each of the SELECT statements must be in the same order. In other words, the order in both tables in our example tables should be name, last_contacted, contact_type, misc_table_number. This condition was also met in our example and that is why we were able to use UNION ALL.
Q14: Looking at the results of Q13, it looks like both tables included the same data for david and elizabeth. Can you create the same table but only include unique rows (i.e. de-dupe the results)? No need to show which table rows belong to.
A14: This can easily be done by using UNION, instead of UNION ALL. In other words, UNION only selects distinct values, while UNION ALL selects all the values. The query is as follows:
SELECT * FROM misc_part1 UNION SELECT * FROM misc_part2 ORDER BY name;
Results:
As expected, results now only include distinct rows and the total number of rows is now 14 (excluding headers), instead of 16 that we saw in Q13.
8.2. Null Handling
Q15: Create a temporary table named combined_table that includes distinct rows of the combined tables (similar to Q14). Where last_contacted value is missing, enter the value as 1901-01-01. We also know that contact_type that is missing is phone_call so fill those in as well.
A15:
WITH combined_table as ( SELECT * FROM misc_part1 UNION SELECT * FROM misc_part2 ORDER BY name ) SELECT name, NVL(last_contacted, '1901-01-01') as last_contacted, COALESCE(contact_type, 'phone_call') AS contact_type FROM combined_table;
Results:
Results are as we expected so let’s talk about what happened.
There are two null functions that we used in this exercise and for our purposes they are both similar. They both return an alternative value when an expression is NULL. I wanted to use both to showcase them both but you can choose to use either one. Let’s look at them more closely:
NVL(last_contacted, '1901-01-01') as last_contacted, COALESCE(contact_type, 'phone_call') AS contact_type
The first one is saying that when encountering NULL in column last_contacted, replace that with 1901-01-01. Similarly, the second one is saying that when encountering NULL in column contact_type, replace that with phone_call, as instructed in the question.
Pro Tip: Depending on the SQL environment, statements can slightly vary but concepts remain the same. For example, COALESCE() is used across MySQL, SQL Server, Oracle, and MS Access, while NVL() mainly works with Oracle.
8.3. Date Management
Q16: Starting with the combined_table, create separate columns for year, quarter, month and date of the last time individuals were contacted, where such a date is available.
A16: Let’s look at the structure and results first and then talk through it.
WITH combined_table as ( SELECT * FROM misc_part1 UNION SELECT * FROM misc_part2 ORDER BY name ) SELECT name, last_contacted, DATE_PART(year, last_contacted) AS year_contacted, DATE_PART(quarter, last_contacted) AS quarter_contacted, DATE_PART(month, last_contacted) AS month_contacted, DATE_PART(day, last_contacted) AS day_contacted, contact_type FROM combined_table WHERE last_contacted IS NOT NULL;
Results:
There are two new concepts to cover in this exercise. First one is to extract one specific part (e.g. year, quarter, month, or day) from a date, which can be implemented as follows:
DATE_PART(year, last_contacted) AS year_contacted,
Idea is simple. year identifies which part of the date is to be extracted and then it is followed by the column name where the original date exists, which is last_contacted in this example.
Second concept is to filter out the rows with NULL values, which was accomplished using the following:
WHERE last_contacted IS NOT NULL
Pro Tip: In SQL Server, instead of DATE_PART(), one uses DATEPART().
Section 3 — Cheat Sheet
I have included the cheat sheet that I had developed for myself here. Before you start using it, I do have two parting recommendations about this and other cheat sheets:
When possible, build your own cheat sheet over time, instead of relying on a prepared cheat sheet. You will learn much more when you go through creating and updating it on your own, compared to one that is shared with you.
If you do not have the time to create your own cheat sheet, take an existing cheat sheet and make it your own. What I mean is that start with the existing cheat sheet but add to it, revise it, edit it and at some point it will become “your” cheat sheet and you will continue to learn along the way.
Summary
In this post, we started with discussing the importance of data retrieval using SQL for a Data Scientist. Then we moved on to learning the most common SQL concepts with the help of 16 practice questions and answers. Finally a reference sheet for some of the more common concepts in SQL was provided for future reference.
Thanks for Reading!
This post took me about 9.5 hours to research and prepare. If you found it helpful, please follow me on Medium and support me in getting to 1,000 followers!
Afin de travailler avec des exemples concrets que vous pouvez suivre, nous allons d'abord créer deux tables temporaires nommées "personnes" et "salaire", puis ces deux tables temporaires seront utilisées pour le reste de l'exercice. L'avantage de cette approche est que vous pouvez copier et coller la partie de création de tableau, puis vous entraîner le long des étapes suivantes pour maximiser l'opportunité d'apprentissage. Veuillez noter que vous n'avez pas besoin de comprendre la partie création de table du travail à ce stade. La partie que vous devez comprendre commence après cela.
Alors, allez-y, copiez et collez ce qui suit dans votre outil SQL préféré et appuyez sur Exécuter.
Voyons maintenant ce que nous avons créé dans les deux tables temporaires ci-dessus. Je vais les présenter sous forme de questions et de réponses afin que vous puissiez également essayer de résoudre chaque question en premier, avant de regarder la réponse que j'ai fournie. Ce que j'ai fourni n'est qu'une façon de résoudre ces questions - généralement, il y a plus d'une bonne façon d'arriver à la même réponse.
Q1 : Récupérez toutes les données disponibles dans le tableau people.
A1 :
En général, je vais d'abord fournir les réponses, puis expliquer ce qui s'est passé et comment fonctionnent différents concepts SQL, soit sous la même question, soit sous la suivante. Par exemple, Q2 expliquera ce qui s'est passé en Q1.
SELECT * FROM personnes ;
Résultats:
salaryEt pour référence future, voici à quoi ressemble le tableau , en utilisant ce qui suit :
SELECT * FROM salaire ;
Conseil de pro : à ce stade, je vous suggère de prendre un instantané de ces deux tableaux. J'ai ajouté des instructions ci-dessous sur la façon de prendre un instantané. Vous voudrez vous référer aux tableaux pour écrire des requêtes à l'avenir et faire défiler de haut en bas dans cet article pour trouver que ces deux tableaux ne seront pas amusants. Pour plus de commodité, j'ai inclus des instructions sur la façon de prendre un instantané sur Mac et Windows :
Utilisateurs Mac — Appuyez sur ces trois touches et maintenez-les enfoncées : Maj, Commande et 4, puis sélectionnez la zone pour l'instantané. Si vous ne souhaitez pas sélectionner manuellement la zone d'instantané, appuyez simultanément sur Maj, Commande et 3 pour créer un écran d'impression.
Utilisateurs Windows - Appuyez sur ces trois touches ensemble : Touche Windows + Maj + S, puis créez un instantané à l'aide de l'outil Snip and Sketch (je suis sur un Mac, donc cela est basé sur une recherche Google - j'espère que cela fonctionnera).
Q2 : Expliquez ce qui s'est passé dans la requête précédente.
A2 : SELECTl'instruction est utilisée pour sélectionner des données à partir d'une base de données donnée (c'est-à-dire une table). FROMest utilisé pour indiquer dans quelle base de données les données doivent être sélectionnées ("personnes" dans cet exemple). Et enfin *dit simplement tout sélectionner dans cette base de données.
Q3 : Afficher uniquement les données des colonnes suivantes : ['name', 'gender', 'job_title'].
A3 :
SELECT nom, sexe, job_title FROM personnes ;
Résultats:
Celui-ci était simple - nous pouvons indiquer explicitement si les colonnes doivent être sélectionnées.
Q4 : Quelles personnes travaillent à New York ?
A4 :
SELECT nom FROM personnes WHERE emplacement = 'new_york';
Résultats:
Comme le montre cet exemple, nous pouvons filtrer les résultats à l'aide de WHERE.
3. Agrégations
Les agrégations nous aident à effectuer des analyses de base sur les données. Les exemples incluent COUNT(), MAX(), MIN(), AVG()et SUM().
Q5 : Combien de personnes vivent à Seattle ou à Phonenix ?
A5 :
SELECT count(person_id) FROM people WHERE location IN ('seattle', 'phoenix');
Résultats:
Q6 : Combien d'individus vivent dans chacune des villes ?
A6 :
SELECT location, count(person_id) FROM personnes GROUP BY location ;
Résultats:
Nous savions déjà ce que COUNT()c'était à partir de la question précédente, mais maintenant il y a aussi un nouveau concept à la fin de la requête. GROUP BYL'instruction regroupe des lignes de données avec les mêmes valeurs et est utilisée avec des fonctions d'agrégation, telles que COUNT(), MAX(), MIN(), AVG()et SUM().
Q7 : Quel est le total des salaires à Phoenix et à New York ?
A7 :
SELECT location, SUM(salary) AS total_salary FROM people WHERE location = 'phoenix' OR location = 'new_york' GROUP BY location;
Résultats:
Il y a deux nouveaux apprentissages dans cet exemple. La première est que dans cet exemple, nous avons attribué un alias à la SUM()colonne et la colonne résultante est maintenant nommée total_salary. Si nous ne l'avions pas fait, la colonne aurait ressemblé à ceci :
Le deuxième apprentissage est une approche alternative. Notez que dans le WHERE, nous utilisons OR, qui détermine locationpeut être phoenixou new_york. Alternativement, nous aurions pu utiliser location IN ('phoenix', 'new_york'), similaire à l'approche que nous avons utilisée dans Q5. Cela ne change pas les résultats et est simplement une approche alternative.
4. Jointures
C'est l'une des fonctionnalités les plus utiles. Ce. aide à visualiser les données lorsque vous essayez de comprendre les requêtes. Regardons d'abord un exemple, puis j'expliquerai ce qui se passe.
Q8 : Montrez-moi les noms des personnes, leurs emplacements et leurs salaires, ainsi que le salaire moyen dans chaque emplacement.
R8 : Notez qu'il s'agit d'un nouveau type de question que nous n'avions pas abordé auparavant. Les noms individuels, les emplacements et leurs salaires sont disponibles dans le tableau peopletandis que le salaire moyen par emplacement est disponible dans le tableau salary. Ce dont nous avons besoin, c'est de JOINces deux tables ensemble, ce qui est possible car les deux tables ont le nom des villes en commun (bien que les colonnes soient nommées locationdans peopleet citydans salary). La façon dont nous structurons cela est la suivante :
SELECT p.name, p.location, p.salary, s.average_salary FROM personnes p LEFT JOIN salaire s ON p.location = s.city ;
Résultats:
Alors que s'est-il passé exactement ? Regardons la jointure de plus près :
FROM personnes p LEFT JOIN salaire s ON p.lieu = s.ville
Ce qui précède dit que regarder table peoplepuis le joindre à table salary. Ensuite, nous devons indiquer les colonnes de chaque table qui partagent les mêmes valeurs (imaginez assembler ces deux tables uniquement là où il y a des valeurs communes dans ces deux colonnes particulières). Les colonnes avec des valeurs mutuelles sont locationfrom peopleet cityfrom salary. Enfin, nous utilisons ici le concept d'alias. Par exemple, table peoplea maintenant un alias de p, tandis que table salarya un alias de s. Lorsque nous voulons indiquer la colonne locationde people, nous pouvons soit indiquer cela comme people.locationou p.location. Il en va de même pour la façon dont nous indiquons la colonne cityà partir salaryde s.city. Notez que la structure est toujours aussitable_name.column_name. Enfin, nous avons utilisé a LEFT JOINdans cet exemple. Ci-dessous, je vais discuter de ce que cela signifie et des autres moyens de rejoindre des tables.
Comme mentionné, nous avons utilisé a LEFT JOINdans cet exemple. Il est utile d'avoir l'image ci-dessous à l'esprit lorsque l'on réfléchit à son JOINfonctionnement.
Voici quelques-uns des JOINtypes les plus courants :
JOINor INNER JOIN: Renvoie les données qui ont des valeurs correspondantes dans les tables de gauche et de droite.
LEFT JOINor LEFT OUTER JOIN: Renvoie les données de la table de gauche et toutes les données correspondantes de la table de droite.
RIGHT JOINou RIGHT OUTER JOIN: C'est le revers du LEFT JOIN. Il renvoie les données de la table de droite et toutes les données correspondantes de la table de gauche.
FULL JOINor FULL OUTER JOIN: Renvoie toutes les données de la table de gauche ou de droite lorsqu'il y a une correspondance entre les deux
5. Conditionnels
Q9 : Créez une nouvelle colonne dans peoplequi décompose les titres de poste en technologies et non technologies. Renvoie uniquement les noms, les intitulés de poste, la nouvelle catégorie nommée as job_groupet les salaires.
A9 : Afin de répondre à cette question, nous devons d'abord voir quels titres de poste uniques sont disponibles en peopleutilisant la requête suivante :
SELECT DISTINCT job_title FROM personnes ;
Notez qu'il DISTINCTs'agit de l'instruction qui a été ajoutée pour SELECTne renvoyer que des valeurs uniques (ou distinctes) dans cette colonne, comme le montrent les résultats ci-dessous :
Maintenant que nous savons quels titres de poste existent, nous pouvons les décomposer en technologies et non technologies, comme indiqué. Pour cet exercice, utilisez votre meilleur jugement pour déterminer quel rôle est technique et ce qui n'est pas technique - une partie importante est d'apprendre à implémenter cette logique dans la requête. Par exemple, financial_analystest un rôle non technique, tandis que data_scientistest un rôle technique. La requête suivante fait exactement cela :
SELECT name, job_title, CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group, salaire FROM people ;
Résultats:
Parlons du fonctionnement de l' CASEexpression en regardant de plus près ce que nous avons fait pour cette question :
CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group,
Afin de mettre en œuvre cette logique, nous commençons d'abord par l' CASEexpression puis identifions les conditions, en utilisant WHEN, telles que :
WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech'
Le script ci-dessus examine d'abord la colonne job_title, et si la valeur de cette colonne pour une ligne est soit software_developerou data_scientist, il affiche ensuite tech. La même logique s'applique pour la non-techcatégorie.
Alors il y a ELSE job_title. Cette clause indique que si une valeur est rencontrée dans la colonne qui n'était pas couverte par les WHENconditions ci-dessus, renvoie la valeur qui existe dans la colonne job_title. Par exemple, si nous avions une ligne dans la colonne job_titleavec la valeur de chef, car chefne faisait pas partie de celles que nous avions incluses dans les WHENconditions (c'est-à-dire software_developer, data_scientist, financial_analystet physician), alors cette clause renverrait la valeur d'origine dans la colonne job_title, qui était chef.
La dernière partie du script est END AS job_group. CASEL' expression se termine par un ENDet AS job_groupest l'alias donné à la colonne résultante. C'est pourquoi la colonne est nommée job_groupdans le tableau des résultats.
Conseil de pro : l'ordre des instructions conditionnelles, en commençant parWHEN, est important. Une fois qu'une condition est vraie, il arrête de lire le reste des conditions et renvoie le résultat. Si aucune des conditions n'est remplie, elle renvoie la valeur de laELSEclause.
Q10 : Qu'est-ce qui job_grouprapporte le plus d'argent en moyenne ? Ordonnez les résultats du plus élevé au plus bas job_group.
A10 : J'inclurai deux approches pour répondre à cette question, chacune introduira de nouveaux concepts. Je vais d'abord couvrir une approche pour montrer comment utiliser FROMdirectement sur une sortie d'une requête, puis je couvrirai une approche différente pour démontrer l'application de WITHla clause.
Approche 1 :
SELECT job_group, AVG(salary) AS average_salary FROM ( SELECT person_id, CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE titre_emploi END AS groupe_emploi, salaire FROM personnes ) GROUP BY groupe_emploi ORDER BY salaire_moyen DESC ;
Résultats:
Alors que s'est-il passé exactement ? Dans le passé, nous incluions toujours le nom d'une table après FROM, mais le nouveau concept ici est que nous pouvons également inclure une requête interne au lieu du nom de la table. Ce qui se passe, c'est que la requête interne après FROMest d'abord exécutée, puis les résultats sont utilisés comme table temporaire pour sélectionner des données dans la requête externe.
Approche 2 :
Je commencerai par inclure la requête utilisée dans cette approche, puis j'expliquerai chacune des étapes de la requête. Commençons d'abord par la requête :
WITH count_table AS( SELECT CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group, count(person_id) as total_count FROM people GROUP BY 1 ), total_salary_table AS( SELECT CASE WHEN job_title IN ('software_developer', 'data_scientist') THEN 'tech' WHEN job_title IN ('financial_analyst', 'physician') THEN 'non-tech' ELSE job_title END AS job_group, SUM(salaire) as total_salary FROM people GROUP BY 1 ) SELECT ct.job_group, tst.total_salary / ct.total_count as average_salary FROM count_table ct INNER JOIN total_salary_table tst ON ct.job_group = tst.job_group ORDER BY average_salary DESC ;
Regardons maintenant de plus près ce qui s'est passé.
Nous allons d'abord créer une table temporaire à l'aide de la WITHclause, nommée count_table. Ce tableau temporaire indique le nombre d'individus dans chaque job_group, comme indiqué ci-dessous :
Ensuite, nous allons créer une deuxième table temporaire nommée total_salary_tablequi affiche le salaire total de chaque job_group, comme indiqué ci-dessous :
Maintenant que nous avons ces deux tableaux, nous pouvons les joindre pour trouver le salaire moyen de chacun job_group, comme suit :
Les résultats ici correspondent aux résultats de l'approche 1 comme prévu.
6. Fonctions de la fenêtre
Ces fonctions sont plus difficiles à visualiser au départ, alors ne soyez pas déçu si vous ne l'obtenez pas tout de suite. Passez en revue quelques exemples et vous commencerez à mieux les comprendre. Nous les incluons généralement dans nos entretiens, je vous recommande donc de les comprendre et de les mettre en pratique.
Les fonctions de fenêtre effectuent généralement un calcul sur un ensemble de lignes de table qui sont liées d'une manière ou d'une autre. En d'autres termes, ils ressemblent en quelque sorte à des agrégations avec quelques mises en garde. Contrairement aux agrégations, les fonctions de fenêtre n'entraînent pas le regroupement des lignes. Cela deviendra plus clair avec des exemples.
Voici quelques exemples de fonctions de fenêtre : COUNT(), AVG(), SUM(), ROW_NUMBER(), RANK(), DENSE_RANK(), LAG, et LEADet elles sont structurées comme suit :
fonction_fenêtre(nom_colonne) OVER ( PARTITION BY nom_colonne ORDER BY nom_colonne ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS output_alias;
L'exemple ci-dessus montre les éléments les plus courants de la structure des fonctions de fenêtre, mais ils ne sont pas tous nécessaires. Je sais que jusqu'à présent, cela n'a pas de sens, alors examinons certains d'entre eux dans la pratique pour mieux les comprendre.
Q11 : Créez un classement des salaires dans l'ensemble et également par sexe, du salaire le plus élevé au salaire le plus bas.
A11 :
SELECT nom, sexe, salaire, RANK() OVER( ORDER BY salaire DESC ) AS salaire_rang_overall, RANK() OVER( PARTITION BY sexe ORDER BY salaire DESC ) AS salaire_rang_by_sexe FROM personnes ORDER BY salaire_rang_overall, salaire_rang_by_sexe ;
Résultats:
Comme vous pouvez le voir dans les résultats, en utilisant RANK()la fonction de fenêtre, nous avons pu classer les salaires à la fois dans une perspective globale et également dans chacune des catégories masculines ou féminines. À l'intérieur du OVER(), nous avons indiqué que nous voulions que le classement soit décomposé (ou «partitionné») par genderet également trié par salaire du plus élevé au plus bas en ajoutant ORDER BY salary DESC.
Q12 : Créer une somme cumulée de chaque salaire à la fois globalement (c'est-à-dire que la somme à la ligne 2 doit être la ligne 1 + la ligne 2, la somme à la ligne 3 doit être la somme de la ligne 1 + la ligne 2 + la ligne 3, et ainsi de suite) et aussi par sexe , classés par âge (du plus vieux au plus jeune). Incluez également une colonne indiquant le salaire total à chaque ligne.
A12 :
SELECT nom, sexe, année_de_naissance, salaire, SUM(salaire) OVER( ORDER BY rows_année_de_naissance entre UNBOUNDED PRECEDING And CURRENT ROW ) AS running_total_salary_overall, SUM(salary) OVER( PARTITION BY gender ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And CURRENT ROW ) AS running_total_salary_by_gender, SUM(salary) OVER( ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And UNBOUNDED FOLLOWING ) AS total_salary FROM people ORDER BY running_total_salary_overall, running_total_salary_by_gender ;
Résultats:
Une fois que vous aurez parcouru les explications ci-dessous et que vous les aurez comparées aux résultats, je suis sûr que vous apprécierez également la puissance des fonctions de fenêtre.
Celui-ci a utilisé de nouvelles déclarations, alors regardons-les de plus près. Regardons d'abord ce qui suit :
SUM(salary) OVER( ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And CURRENT ROW ) AS running_total_salary_overall,
SUM(salary)additionne simplement les salaires, mais la façon dont les salaires doivent être additionnés est la partie la plus intéressante et est indiquée dans la OVER()déclaration. Tout d'abord, les salaires sont triés par année de naissance des individus du plus bas au plus élevé, comme indiqué dans ORDER BY birth_year, (l'absence de DESCnous indique que c'est dans un ordre croissant). Indique ensuite ROWScomment le SUM()doit être appliqué à ces lignes de données. Dans ce cas, la sommation est appliquée à toutes les lignes avant chaque ligne ( UNBOUNDED PRECEDING) jusqu'à et y compris la ligne elle-même ( CURRENT ROW).
Et si nous voulions que la somme de chaque ligne inclue tout ce qui précède et aussi tout ce qui suit cette ligne spécifique ? Cela peut être accompli par ce qui suit :
LIGNES entre UNBOUNDED PRECEDING et UNBOUNDED FOLLOWING
Cette structure a été utilisée pour calculer le total_salarydans la partie suivante de la requête et vous pouvez voir dans la section des résultats :
SUM(salaire) OVER( ORDER BY birth_year ROWS between UNBOUNDED PRECEDING And UNBOUNDED FOLLOWING ) AS total_salary
Étant donné que pour chaque ligne, nous voulions avoir la somme de toutes les lignes avant et après une ligne donnée, nous avons essentiellement demandé à voir la somme totale de cette colonne affichée à chaque ligne sous total_salary, qui était la même valeur de 868,000pour chaque ligne.
8. Divers - Opérateur UNION, gestion des valeurs nulles et gestion des dates
Jusqu'à présent, nous avons couvert certains des concepts les plus courants qui peuvent vous aider à rédiger vos propres requêtes. Dans cette partie, je vais couvrir quelques sujets supplémentaires qui peuvent également vous aider dans les entretiens.
Comme la dernière fois, créons d'abord deux nouvelles tables nommées misc_part1, misc_part2puis passons en revue les concepts. Pour l'instant, copiez, collez et exécutez le script ci-dessous pour créer les tables temporaires pour cette partie de l'exercice.
Point bonus (facultatif) : Maintenant que vous êtes plus familiarisé avec les différents concepts SQL, jetez un œil à ce que vous copiez et collez et voyez si vous pouvez suivre la logique. Vous pouvez voir que nous définissons d'abord une table, puis spécifions les colonnes dans la table et le type de données associé à chaque colonne, puis ajoutons des valeurs pour chaque colonne (sous le format d'une ligne de valeurs). C'est ça! Vous pouvez maintenant créer vos propres tables temporaires et commencer à en récupérer les données !
Voyons maintenant à quoi ressemblent les tables. Je vais utiliser ce qui suit pour regarder mist_part1:
SELECT * FROM misc_part1 :
Résultats:
puis utilisera ce qui suit pour voir misc_part2:
SELECT * FROM misc_part2 ;
Résultats:
Nous pouvons considérer ces deux tableaux comme des données marketing sur la dernière fois que les clients ont été contactés et également sur la manière dont ils ont été contactés, par exemple par e-mail, appel téléphonique, etc. Il y a trois observations rien qu'en regardant les données :
Les deux tables ont les mêmes noms de colonne, il est donc peut-être possible de les combiner dans le cadre de notre exercice, que nous explorerons plus en détail sous UNION.
Certaines valeurs sont manquantes, ce qui n'était pas le cas dans les tables peopleet . salaryPar exemple, en regardant misc_part2, les cellules B5et B6sont vides. Il y a aussi des valeurs manquantes dans misc_part1. Nous discuterons de la gestion des nulls.
Les deux tables incluent une colonne de format de date, que nous n'avions pas dans les tables peopleet . salaryNous utiliserons ces valeurs pour certaines manipulations de date.
Comme la dernière fois, n'hésitez pas à prendre un instantané de ces deux tableaux pour votre référence, puis continuons avec le même format de questions et réponses pour couvrir de nouveaux concepts.
8.1. Opérateur UNION
Q13 : Je vois que les deux tables incluent les mêmes colonnes. Pouvez-vous les combiner en un seul tableau ? Triez les résultats par nom et identifiez également quelle ligne appartient à quelle table.
R13 : N'oubliez pas que nous avons commencé avec deux tableaux, chacun d'eux comprenant 8 lignes de données (à l'exclusion des en-têtes). Nous nous attendons donc à ce que le tableau combiné inclue 16 lignes de données (à l'exclusion des en-têtes).
Cela peut être fait UNION ALLen utilisant le format suivant :
SELECT *, 1 AS misc_table_number FROM misc_part1 UNION ALL SELECT *, 2 AS misc_table_number FROM misc_part2 ORDER BY nom ;
Résultats:
Les résultats incluent 16 lignes de données (à l'exclusion des en-têtes) comme prévu. Parlons de ce qui s'est passé.
UNION ALLL'opérateur place les données de chacune des requêtes et les empile les unes sur les autres. Quelques conditions doivent être remplies pour que cela fonctionne correctement :
Chaque SELECTélément de l' UNION ALLopérateur doit avoir le même nombre de colonnes. Par exemple, dans notre exercice chaque tableau comprend 4 colonnes.
Les colonnes correspondantes de chaque SELECTinstruction doivent avoir le même type de données. Par exemple, namedans les deux tables sont dans VARCHAR(30)le type de données, ou last_contacteddans les deux tables sont dans DATEle type de données.
Les colonnes de chacune des SELECTdéclarations doivent être dans le même ordre. En d'autres termes, l'ordre dans les deux tables de nos exemples de tables doit être name, last_contacted, contact_type, misc_table_number. Cette condition était également remplie dans notre exemple et c'est pourquoi nous avons pu utiliser UNION ALL.
Q14 : En examinant les résultats de la Q13, il semble que les deux tableaux contiennent les mêmes données pour davidet elizabeth. Pouvez-vous créer le même tableau mais n'inclure que des lignes uniques (c'est-à-dire dédoubler les résultats) ? Il n'est pas nécessaire d'indiquer à quelles lignes du tableau appartiennent.
A14 : Cela peut facilement être fait en utilisant UNION, au lieu de UNION ALL. En d'autres termes, UNIONne sélectionne que des valeurs distinctes, tandis que UNION ALLsélectionne toutes les valeurs. La requête est la suivante :
SELECT * FROM misc_part1 UNION SELECT * FROM misc_part2 ORDER BY nom ;
Résultats:
Comme prévu, les résultats n'incluent désormais que des lignes distinctes et le nombre total de lignes est désormais de 14 (hors en-têtes), au lieu de 16 au Q13.
8.2. Traitement nul
Q15 : Créez une table temporaire nommée combined_tablequi inclut des lignes distinctes des tables combinées (similaire à Q14). Lorsque last_contactedla valeur est manquante, entrez la valeur sous la forme 1901-01-01. Nous savons aussi que ce contact_typequi manque est phone_calldonc remplissez-les également.
A15 :
WITH combination_table as ( SELECT * FROM misc_part1 UNION SELECT * FROM misc_part2 ORDER BY name ) SELECT name, NVL(last_contacted, '1901-01-01') as last_contacted, COALESCE(contact_type, 'phone_call') AS contact_type FROM combination_table ;
Résultats:
Les résultats sont conformes à nos attentes, alors parlons de ce qui s'est passé.
Il y a deux fonctions nulles que nous avons utilisées dans cet exercice et pour nos besoins, elles sont toutes les deux similaires. Ils renvoient tous les deux une valeur alternative lorsqu'une expression est NULL. Je voulais utiliser les deux pour les présenter tous les deux, mais vous pouvez choisir d'utiliser l'un ou l'autre. Regardons-les de plus près :
NVL(last_contacted, '1901-01-01') as last_contacted, COALESCE(contact_type, 'phone_call') AS contact_type
Le premier dit que lorsque vous rencontrez NULL dans la colonne last_contacted, remplacez-le par 1901-01-01. De même, le second dit que lorsque vous rencontrez NULL dans la colonne contact_type, remplacez-le par phone_call, comme indiqué dans la question.
Conseil de pro : selon l'environnement SQL, les instructions peuvent légèrement varier, mais les concepts restent les mêmes. Par exemple,COALESCE()est utilisé sur MySQL, SQL Server, Oracle et MS Access, alors qu'ilNVL()fonctionne principalement avec Oracle.
8.3. Gestion des dates
Q16 : En commençant par combined_table, créez des colonnes séparées pour l'année, le trimestre, le mois et la date de la dernière fois que les personnes ont été contactées, lorsqu'une telle date est disponible.
R16 : Examinons d'abord la structure et les résultats, puis discutons-en.
WITH combination_table as ( SELECT * FROM misc_part1 UNION SELECT * FROM misc_part2 ORDER BY name ) SELECT name, last_contacted, DATE_PART(year, last_contacted) AS year_contacted, DATE_PART(trimestre, last_contacted) AS quarter_contacted, DATE_PART(month, last_contacted) AS month_contacted, DATE_PART (jour, dernier_contacté) AS jour_contacté, type_contact FROM tableau_combiné WHERE dernier_contacté IS NOT NULL ;
Résultats:
Il y a deux nouveaux concepts à couvrir dans cet exercice. La première consiste à extraire une partie spécifique (par exemple, année, trimestre, mois ou jour) d'une date, qui peut être implémentée comme suit :
DATE_PART(year, last_contacted) AS year_contacted,
L'idée est simple. yearidentifie quelle partie de la date doit être extraite, puis elle est suivie du nom de la colonne où se trouve la date d'origine, qui est last_contacteddans cet exemple.
Le deuxième concept consiste à filtrer les lignes avec des valeurs NULL, ce qui a été accompli en utilisant ce qui suit :
WHERE last_contacted IS NOT NULL
Conseil de pro : dans SQL Server, au lieu deDATE_PART(), on utiliseDATEPART().
Section 3 — Aide-mémoire
J'ai inclus la feuille de triche que j'avais développée pour moi-même ici. Avant de commencer à l'utiliser, j'ai deux recommandations à ce sujet et sur d'autres aide-mémoire :
Dans la mesure du possible, créez votre propre feuille de triche au fil du temps, au lieu de vous fier à une feuille de triche préparée. Vous en apprendrez beaucoup plus lorsque vous le créerez et le mettrez à jour par vous-même, par rapport à celui qui est partagé avec vous.
Si vous n'avez pas le temps de créer votre propre feuille de triche, prenez une feuille de triche existante et personnalisez-la. Ce que je veux dire, c'est que commencez par la feuille de triche existante, mais ajoutez-y, révisez-la, modifiez-la et à un moment donné, elle deviendra "votre" feuille de triche et vous continuerez à apprendre en cours de route.
-- Introduction
SELECT column_1, column_2… [Returns entries per specified columns] SELECT * [Returns all entries] SELECT DISTINCT column_1, column_2… [Returns unique entries per specified columns] SELECT DISTINCT * [Returns all unique entries]
FROM schema_name.table_name [Specifies table to return entries from]
WHERE column_1 = ‘value’ [Specifies condition per column] May use =, >, <, >=, <=, <>
WHERE column_1 = ‘value’ AND column_2 = ‘value’ [Specifies both conditions must be met]
WHERE column_1 = ‘value’ OR column_2 = ‘value’ [Specifies one condition must be met]
WHERE column_1 = ‘value’ AND (column_2 = ‘value’ OR column_3 = ‘value’)
-- Aggregations
SUM(column_1) [Returns summation of column]
AVG(column_1) [Returns average of column]
MIN(column_1) [Returns minimum of column]
MAX(column_1) [Returns maximum of column]
COUNT(column_1) [Returns count of entries of column]
COUNT(DISTINCT column_1) [Returns unique count of entries of column]
Aggregations require a GROUP BY clause, i.e. GROUP BY 1, 2 or GROUP BY column_1, column_2 All non-aggregated columns are required to be in the GROUP BY clause
-- Null Handling
IS NULL [Evaluates if an entry is null]
IS NOT NULL [Evaluates if an entry is not null]
NVL(column_1, ‘alternative value’) [Replaces entry with an alternative value if found to be null]
NULLIF(column_1,’value’) [Replaces entry with null if entry meets value]
-- Aliases
column_1 AS ‘value’ [Renames column]
schema_name.table_name AS ‘value’ [Renames table]
-- Joins (added in From and then join on primary keys)
INNER JOIN schema_name.table_name ON table1.value1 = table2.value1 [Merges tables on matching values, contains only records with matching values found in both table 1 and table 2]
LEFT JOIN schema_name.table_name ON table1.value1 = table2.value1 [Merges tables on matching values, contains all records from table 1]
RIGHT JOIN schema_name.table_name ON table1.value1 = table2.value1 [Merges tables on matching values, contains all records from table 2]
FULL OUTER JOIN schema_name.table_name ON table1.value1 = table2.value1 [Merges tables on matching values, contains all records from both table 1 and table 2]
-- UNIONs
UNION [Merges two tables whose structures are the same, removes duplicative records] UNION ALL [Merges two tables whose structures are the same]
-- Miscellaneous
LIMIT x [Limits returned records to a specified number]
ORDER BY 1, 2 or ORDER BY column_1, column_2 [Orders returned records by specified columns]
LIKE ‘%value%’ [Used to find near matches when evaluating conditions]
IN (‘value 1’,’value 2’) [Used to find matches within a list of values when evaluating conditions]
BETWEEN ‘value 1’ AND ‘value 2’ [Used to find matches where an entry falls between two values when evaluating conditions] (inclusive)
HAVING [Used in place of WHERE for aggregate clauses] (WHERE filters before data is grouped. HAVING filters after data is grouped.)
CASE WHEN column1 = ‘value’ THEN ‘value’ ELSE ‘value’ END AS ‘value’ [Returns entries by evaluating when/then statements] CASE WHEN A THEN X WHEN B THEN Y WHEN C THEN Z ELSE W END
NOT [Used to limit to where a condition is not met]
-- Date Management
CURRENT_DATE [Returns current date]
TO_CHAR (column name, 'date format') [converts the date to a character string in a specified format] Date formats: 'YYYY/MM/DD' 'YYYY:MM:DD' 'Q' = Quarter number 'MM' = Month number 'Month' = Month name 'W' = Week of the month (1-5; first week starts on the first day of the month) 'WW' = Week number of year (1-53; first week starts on the first day of the year) 'Day' = Day name 'DDD' = Day of year (1-366) 'DD' = Day of month (1-31) 'D' = Day of week (1-7; Sunday is 1) 'YYYY/MM/DD HH24:MI:SS' = Year, month, day, hour, minutes and seconds
DATE_TRUNC ('datepart', column) [truncates a time stamps expression to a specified date part, such as hour, week or month] Datepart formats: 'year' = truncates to the first second of the first hour of the first day of the year 'quarter' = truncates to the first second/hour/day/quarter 'month' = truncates to the first second/hour/day/month 'week' = truncates to the first second/hour/day/week (Monday) 'day' = truncates to the first second/hour/day
This article explains a simple pure sine wave inverter circuit using Arduino, which could be upgraded to achieve any desired power output as per the user's preference
Contents
Circuit Operation
In the last article we learned how to generate sine wave pulse width modulation or SPWM though Arduino, we are going to use the same Arduino board to make the proposed simple pure sine wave inverter circuit.The design is actually extremely straightforward, as shown in the following figure.
You just have to program the arduino board with the SPWM code as explained in the previous article, and hook it up with some of the external devices.
Pin#8 and pin#9 generate the SPWMs alternately and switch the relevant mosfets with the same SPWM pattern.
The mosfst in turn induce the transformer with high current SPWM waveform using the battery power, causing the secondary of the trafo to generate an identical waveform but at the mains AC level.
The proposed Arduino inverter circuit could be upgraded to any preferred higher wattage level, simply by upgrading the mosfets and the trafo rating accordingly, alternatively you can also convert this into a full bridge or an H-bridge sine wave inverter
Powering the Arduino Board
In the diagram the Arduino board could be seen supplied from a 7812 IC circuit, this could be built by wiring a standard 7812 IC in the following manner. The IC will ensure that the input to the Arduino never exceeds the 12V mark, although this might not be absolutely critical, unless the battery is rated over 18V.
If you have any questions regarding the above SPWM inverter circuit using a programmed Arduino, please feel free to ask them through your valuable comments.
Waveform Images for Arduino SPWM
Image of SPWM waveform as obtained from the above Arduino inverter design (Tested and Submitted By Mr. Ainsworth Lynch)
UPDATE:
Using BJT Buffer Stage as Level Shifter
Since an Arduino board will produce a 5V output, it may not be an ideal value for driving mosfets directly.
Therefore an intermediate BJT level shifter stage may be required for raising the gate level to 12V so that the mosfets are able to operate correctly without causing unnecessary heating up of the devices,. The updated diagram (recommended) can be witnessed below:
Transformer = 9-0-9V/220V/120V current as per requirement.
Battery = 12V, Ah value as per requirement
Delay Effect
To ensure that the mosfet stages initiate with a delay during the Arduino booting or start up, you may modify left side BC547 transistors into delay ON stages, as shown below. This will safeguard the mosfets and prevent them from burning during power switch ON Arduino booting.
Adding an Automatic Voltage Regulator
Just like any other inverter the output from this design can rise to unsafe limits when the battery is fully charged.
The BC547 collectors should be connected to the bases of the left side BC547 pair, which are connected to the Arduino via 10K resistors.
For an isolated version of voltage correction circuit we can modify the above circuit with a transformer, as shown below:
How to Setup
To set up the automatic voltage correction circuit, feed a stable 230V or 110V as per your inverter specs to the input side of the circuit.
Next, adjust the 10k preset carefully such that the red LEDs just light up. That's all, seal the preset and connect the circuit with the above Arduino board for implementing the intended automatic output voltage regulation.
Using CMOS Buffer
Another design for the above Arduino sinewave inverter circuit can be seen below, the CMOS IC is used as an aided buffer for the BJT stage
Important:
In order to avoid an accidental switch ON prior to Arduino booting, a simple delay ON timer circuit may be included in the above design, as shown below:
I am an electronic engineer (dipIETE ), hobbyist, inventor, schematic/PCB designer, manufacturer. I am also the founder of the website: https://www.homemade-circuits.com/, where I love sharing my innovative circuit ideas and tutorials. If you have any circuit related query, you may interact through comments, I'll be most happy to help!
Have Questions? Please Comment below to Solve your Queries! Comments must be Related to the above Topic!!
Subscribe
508 COMMENTS
Newest
Ajit Mohanty
2 days ago
very interesting. can we convert DC to AC 11w tube using hybrid transformer? expecting your valuable guidance.
Hi Swagatam I am Designing the H bridge converter, I required Sinusoidal at high frequency (3kHz). Can you guide me on How should I change the Arduino code to obtain High Frequency at Output? Regards
Hi Leo, you can probably try the following circuit for the H bridge design, but make sure the battery and transformer are 12V rated..webp Changing the code can be plenty of hard work and a lot of thinking. Basically you will have to replace each of the microsecond delays with smaller delays such that they together add up to produce 3000 Hz for each channel.
Bob Gardner
3 months ago
Hello Swagatam. You have great circuits and great explanations to teach folks about the circuits. Here’s my idea: 55 gal blue drum full of 400 pounds of water is my gravity battery. Pull it up to some height with a block and tackle. My goal is to run a 5000 btu ac for 1 hour. 5000btuas/3412btusper kw is about 1400 watts, and running this contraption would give some cool air for an hour. Then make someone else crank the drum back to to run it again. At first I looked into a 3phase PMA alternator that puts out 12V 3phase… Read more »
If you are looking for a circuit to limit the speed and the voltage output of an alternator, then yes that can be done through a simple shunt regulator circuit. Is the alternator a 3 phase alternator or a single phase alternator?
Hello Swagatam. I wish to thank you for all the circuits you make available for free on your site. They are very educational to a amateur radio operator like me. My call is KE4WCE. My question is where in the code for the Arduino Pure Sine Wave Inverter Circuit would I need to change from 50Hz (cycles) to 60Hz and replace it with what number. Or if you don’t mind me asking the math formula and a example so I may able to do it myself. Thank You 73’s
Thank you very much Amos, In the Arduino code explained above we have used the total delay on each channel as 10 ms, since each 50 Hz AC half cycle requires 10ms to complete. For a 60 Hz cycle the completion time of half cycle becomes 12 ms. So you will have to adjust the microsecond time (in brackets) on each channel of the code in such a way that it produces a delay of 12 ms or 12000 microseconds.
HI, I Build the first most simple’s project on the page. After programming my arduino I founded that my irf540 is heating up and when i checked the frequency on my arduino ouput it was around 500 hz . I am most certain thats why the mosfets heat up. I cannot see what i am doing wrong. Advise please
MOSFETs can work with frequencies up to MHz and GHz, so frequency can never be a problem, I have tested the design thoroughly and had no such problems. My MOSFETs also heated up a a little but it was because of the 100 watt load at the output of the transformer. If your MOSFETs are heating up due to an output load then it is normal, simply put a heatsink on it.
VIvek Jadhav
3 months ago
And one more thing…. instead of using high voltage capacitors at transformer output for waveform smoothing. can we use low voltage capacitors at the transformer input??? voltage and power handling will be easier… what say??
I don’t think that might be possible, because at the input side everything is PWM based and digital which cannot be transformed into an analogue sine wave….the transformation may be possible only at the output side of the transformer.
VIvek Jadhav
3 months ago
Hiii Swagatam, The delay circuit is introduced as a safety. But What I say is instead of introducing a circuit why dont we use aurduino digital output signal to switch the DC supply to inverter circuit. mainly for three reasons, 1) Human error is eliminated. 2) circuit is simplified 3) Delay circuit malfunction hazard is also eliminated. Whats your say in this??
Thanks Vivek, My Arduino knowledge is not good, so I won’t be able to suggest on this modification. If you think this is something feasible then surely it can be implemented and tried in this design.
Christiaan Esterhuyse
4 months ago
Best Swagatam
I may be understanding the circuit diagram wrong but I simulated the circuit and it does not seem to work. I don’t understand how the positive terminal can be connected directly to the base of the mosfet. What is the reason for the BJT level shifters? Also a positive voltage is applied again directly (resistor in series… so voltage division?) to the bjt. I would like to build this circuit but cannot seem to understand its working principle. Would it be possible to communicate by email for me to better understand the circuit please?
Christiaan, level shifters are necessary to enable the mosfets gates to get 12V supply so that they can conduct optimally. If we connect the mosfets directly with the Arduino, the gates will be be able to get only 5V which may not be sufficient for the mosfets to switch ON perfectly. The first BC547 inverts the Arduino PWM so it cannot be directly used with the mosfets otherwise the mosfets gates would be switched with inverse PWM signals. The second BC547 corrects this issue and makes the PWM same as the Arduino output, but at 12V level The circuit is… Read more »
Thank you for the response. That makes sense but wont the voltage to the mosfet be 13.3V since this is more the operating voltage of a 12V battery? Is this still a suitable value? Then lastly do you have a inverter circuit that does not use a transformer but still steps up from 12V to 230V? Sincerely Christiaan
Most mosfet gates are designed to work with upto 20V, so 13V is quite OK. Making an 12V to 230V inverter without a transformer is almost impossible, as far as I know.
Lalawmpuia, where do you think the feedback can be connected with the Arduino? It will require special coding for that. Connecting the feedback to MOSFET gates is an universal option which can configured easily without any coding by anybody and with any inverter.
Sir can you please explain how the BC547 collector is connected to the base/gate of the BJT/Mosfet . Since I have a hard time understanding that . Since the emmiter side is connected to ground and does that simply pull down ?
The BC547 collector will pull down the mosfet gate while it is ON, and when it is OFF the mosfet gate will get the switching voltage through the 1K resistor.
How does the feedback works? since the feedback ( bc547) emmiter is connected to the base of the bc547( arduino output ) will it just pull down from bc547 base to ground . Please explain the worling of this feedback . Also I tried your Circuit and code as well but I have burn out 10 mosfet ( 5 times irfz44n and 5 times irf3205 ) in just under 10 seconds . It was too hot to even touch .
When the input voltage to the op amp circuit exceeds the set limit the op amp output and the BC547 conduct and grounds the Arduino BC547 bases causing the mosfets to shut off. As soon as the mosfets are turned off the voltage begins to drop which reverts the opamp output causing the mosfets to switch ON again, and this process continues which keeps the output voltage within the desired range. Your mosfets may be burning due to some fault in your design, or may be the mosfets are not good quality. You can in the video the circuit works… Read more »
Hello sir , I have a question about the transformer , if I use a 6-0-6 transformer rated at 3A , at ideal mode , what will be the rated watt of the output ? Since 3A is the highest rated transformer I can buy from our state .
Hello sir , how are you ? I have one question regarding the feedback , my circuit is exactly same as your circuit , and the transformer I used is from an old ups , and the question is , the output Voltage decrease to 140V from 210V when connected a load . Will the feedback you posted work in this situation to give 220V constant even when connected to a load ( 60W incandescent lamb) ?
Hello Fanai, no, the feedback cannot boost a low voltage happening due to over load or low wattage transformer/battery, it can only prevent an over voltage. To correct the voltage drop you must ensure that the transformer, battery and the mosfets are appropriately rated as per the load requirement.
mike webb
8 months ago
k. well the solution i have now will have to do till i have more time. seems the best way forward will be to build an inverter that doesnt have over voltage protection. when i have time i’ll pull the transformer out of the old inverter, check it to make sure it meets the specs of the Arduino. i’ll also grab the heat sinks for the mosfets. then i’ll have another 1000 questions for you. thank you for your help.
this may be off topic, i’m looking at building this inverter but perhaps its better to tell you what my problem is first because you may have a better solution. i have a 300 watt inverter thats doing just fine. but its connected to lithium ion batterys instead of lead acid. the ion batterys are 4p so top voltage is 16.4. the inverter has i high voltage cut off at 15.5. i’m working around the problem by placing a buck inverter between the batterys and the inverter, the buck inverter is set to output at 14.9 volts. this works but… Read more »
Using a buck converter between the battery and the inverter can be indeed quite wasteful, but unfortunately there’s no other more efficient way than a buck converter, except if one Li-Ion cell is reduced, and a 3S combination is used with a 12.6V output supply. The transformer winding rating must match the PWM average of the Arduino in the above explained concept. The average voltage at the collector of the first BC547 transistor will be the value that must match the half winding specs of the transformer. So definitely you can salvage the transformer from your 3000 watt inverter but… Read more »
Johnny
1 year ago
Hi Mr i have built a small circuit using Bc547 & Bc557 as BJT Buffer stage to drive the Mosfets (i’m using IRFZ46) when i check the ouput of the buffer stage connected to Arduino uploaded with Spwm Code on DSO138 oscilloscope the output is same as the pattern of the wave in the picture on the page But on the Drain of the Mosfet its square wave Why? My circuit using readymade oscillator SG3524 is working fine But using the Arduino the output is connected to lamp & is flashing any idea? thanks for ur Help
I need another circuit which i didnt know how to contact the page a small Solid state relay using mosfets which have Com positive or negative 12V NC NO so i can select which source to use power of 2 12v dc inputs & 1 output \]
That is actually not possible. The ON/OFF switching of the drain is dependent on the ON/OFF switching of its gate, and has to be exactly similar. Not sure why it is not happening for your mosfet…
Alan K Gilman
1 year ago
It’s interesting how people solve a problem based on their background. You designed a delay circuit to insure there are no shoot-through conditions at startup. Being a software engineer, I would have done this in software by driving the relay with transistor and an output from the Arduino :-). Great article, thanks for publishing!
Hi Nimel, the pin8 and pin9 go to the first BC547 via 10k, this BC547 base is configured with the delay timer network, so it is the pin8 nd pin9 which go to the delay network
Partha Gorai
1 year ago
Sir, can you please share the transformar specification, required for this project. Thanks.
Partha, the transformer voltage can be 6-0-6V for a 12V battery. Transformer current will depend on the load wattage. If the load wattage is 200 watt then dividing this by 6 becomes 33 amps and so on.
Patrick
1 year ago
Hello Sir, Thanks for the article,i would love to know more on the programming aspect.Please can you explain in detail how the avr side of the above circuit works to protect the circuit,like when the LED lights up,what does it mean,and can i implement it on all other inverter circuit?
Thank you for liking the post! Yes you an change the frequency to any desired value, simply by altering the “microsecond” values in the code. As you can see, the above applications works with 50 Hz frequency, therefore the microsecond delays in the code across both the channels are adjusted accordingly to a 20 ms value. For 1kHz, this total delay for the entire code will need to be adjusted to 1 ms.
Hi, If you are going to use 1kHz, then you will probably need to design or obtain a transformer suitable for the job. The 50/60Hz transformers used here will not provide the performance at 1 kHz.
The impedance of the transformer is lower at 50/60Hz than at 1 kHz.
dead time already introduced at the end of the codes for each channel
digitalWrite(8, LOW); //……
digitalWrite(9, LOW); } //————————————-//
Jayvardhan pandit
1 year ago
Thanks for quick reply on my previous comments. I want to know one more thing how output frequency calculated. For some spacial applicatoin i want to lower the frequency about 10Hz to 15Hz. Please help.
It will depend on the specific oscillator circuit, and the RC parts used in the oscillator. The frequency can be changed by changing the RC values of the oscillator.
Yes, for the above Arduino, the frequency is dependent on the delay (microsecond) set for each of the PWM blocks. You can modify them accordingly to ensure the waveform delay coincides with the frequency timing or the Hz rate
Jaywardhan
2 years ago
Hi, Nice projects is this, I want to make this same project with 110V DC source, can i replace 12V DC source with 110V? Second thing in place of saperate Power On delay circuit, we can also use Arduino itself for the same.
En poursuivant votre navigation sur mon site,
vous acceptez l’utilisation des Cookies et autres traceurs
pour réaliser des statistiques de visites et enregistrer
sur votre machine vos activités pédagogiques.En savoir plus.
Have Questions? Please Comment below to Solve your Queries! Comments must be Related to the above Topic!!
very interesting. can we convert DC to AC 11w tube using hybrid transformer?
expecting your valuable guidance.
Yes that’s possible. Yu can try the following circuit:
https://www.homemade-circuits.com/electronic-ballast-circuit-for-uv-germicidal-lamps/
Hi Swagatam
I am Designing the H bridge converter, I required Sinusoidal at high frequency (3kHz). Can you guide me on How should I change the Arduino code to obtain High Frequency at Output?
Regards
Hi Leo,
you can probably try the following circuit for the H bridge design, but make sure the battery and transformer are 12V rated..webp
Changing the code can be plenty of hard work and a lot of thinking. Basically you will have to replace each of the microsecond delays with smaller delays such that they together add up to produce 3000 Hz for each channel.
Hello Swagatam. You have great circuits and great explanations to teach folks about the circuits. Here’s my idea: 55 gal blue drum full of 400 pounds of water is my gravity battery. Pull it up to some height with a block and tackle. My goal is to run a 5000 btu ac for 1 hour. 5000btuas/3412btusper kw is about 1400 watts, and running this contraption would give some cool air for an hour. Then make someone else crank the drum back to to run it again. At first I looked into a 3phase PMA alternator that puts out 12V 3phase… Read more »
Thanks Bob! Your concept looks interesting.
If you are looking for a circuit to limit the speed and the voltage output of an alternator, then yes that can be done through a simple shunt regulator circuit. Is the alternator a 3 phase alternator or a single phase alternator?
Hello Swagatam.
I wish to thank you for all the circuits you make available for free on your site. They are very educational to a amateur radio operator like me. My call is KE4WCE.
My question is where in the code for the Arduino Pure Sine Wave Inverter Circuit would I need to change from 50Hz (cycles) to 60Hz and replace it with what number.
Or if you don’t mind me asking the math formula and a example so I may able to do it myself.
Thank You 73’s
Thank you very much Amos,
In the Arduino code explained above we have used the total delay on each channel as 10 ms, since each 50 Hz AC half cycle requires 10ms to complete. For a 60 Hz cycle the completion time of half cycle becomes 12 ms. So you will have to adjust the microsecond time (in brackets) on each channel of the code in such a way that it produces a delay of 12 ms or 12000 microseconds.
Thank you For the fast reply. I am going to have a go at it and see if I can duplicate your circuit. Again Thank you for your time and patience.
73’s
I am glad to help!
HI,
I Build the first most simple’s project on the page. After programming my arduino I founded that my irf540 is heating up and when i checked the frequency on my arduino ouput it was around 500 hz . I am most certain thats why the mosfets heat up. I cannot see what i am doing wrong. Advise please
Regards
Peter
MOSFETs can work with frequencies up to MHz and GHz, so frequency can never be a problem, I have tested the design thoroughly and had no such problems. My MOSFETs also heated up a a little but it was because of the 100 watt load at the output of the transformer. If your MOSFETs are heating up due to an output load then it is normal, simply put a heatsink on it.
And one more thing…. instead of using high voltage capacitors at transformer output for waveform smoothing. can we use low voltage capacitors at the transformer input??? voltage and power handling will be easier… what say??
I don’t think that might be possible, because at the input side everything is PWM based and digital which cannot be transformed into an analogue sine wave….the transformation may be possible only at the output side of the transformer.
Hiii Swagatam, The delay circuit is introduced as a safety. But What I say is instead of introducing a circuit why dont we use aurduino digital output signal to switch the DC supply to inverter circuit. mainly for three reasons, 1) Human error is eliminated. 2) circuit is simplified 3) Delay circuit malfunction hazard is also eliminated. Whats your say in this??
Thanks Vivek,
My Arduino knowledge is not good, so I won’t be able to suggest on this modification. If you think this is something feasible then surely it can be implemented and tried in this design.
Best Swagatam
I may be understanding the circuit diagram wrong but I simulated the circuit and it does not seem to work.
I don’t understand how the positive terminal can be connected directly to the base of the mosfet. What is the reason for the BJT level shifters? Also a positive voltage is applied again directly (resistor in series… so voltage division?) to the bjt. I would like to build this circuit but cannot seem to understand its working principle.
Would it be possible to communicate by email for me to better understand the circuit please?
Kindest Regards Christiaan
Christiaan, level shifters are necessary to enable the mosfets gates to get 12V supply so that they can conduct optimally. If we connect the mosfets directly with the Arduino, the gates will be be able to get only 5V which may not be sufficient for the mosfets to switch ON perfectly. The first BC547 inverts the Arduino PWM so it cannot be directly used with the mosfets otherwise the mosfets gates would be switched with inverse PWM signals. The second BC547 corrects this issue and makes the PWM same as the Arduino output, but at 12V level The circuit is… Read more »
Thank you for the response.
That makes sense but wont the voltage to the mosfet be 13.3V since this is more the operating voltage of a 12V battery? Is this still a suitable value?
Then lastly do you have a inverter circuit that does not use a transformer but still steps up from 12V to 230V?
Sincerely Christiaan
Most mosfet gates are designed to work with upto 20V, so 13V is quite OK.
Making an 12V to 230V inverter without a transformer is almost impossible, as far as I know.
Thanks for the replies Swagatam.
This is definitely one of the only sites where you get replies.
What is the wattage of this inberter?
It’s my pleasure Tiaan.
Hello
Can you show the charging circuit?
All the best
J K BARIK
Sure, you can get the desired battery charger circuit from these two articles:
Lead Acid Battery Charger Circuits
Op amp Battery Charger Circuit with Auto Cut Off
Hello sir , can you please tell me how the feedback circuit works and why you connect them in the gate of the Mosfet but not in the arduino .
Lalawmpuia, where do you think the feedback can be connected with the Arduino? It will require special coding for that. Connecting the feedback to MOSFET gates is an universal option which can configured easily without any coding by anybody and with any inverter.
Sir can you please explain how the BC547 collector is connected to the base/gate of the BJT/Mosfet . Since I have a hard time understanding that . Since the emmiter side is connected to ground and does that simply pull down ?
The BC547 collector will pull down the mosfet gate while it is ON, and when it is OFF the mosfet gate will get the switching voltage through the 1K resistor.
How does the feedback works? since the feedback ( bc547) emmiter is connected to the base of the bc547( arduino output ) will it just pull down from bc547 base to ground . Please explain the worling of this feedback . Also I tried your Circuit and code as well but I have burn out 10 mosfet ( 5 times irfz44n and 5 times irf3205 ) in just under 10 seconds . It was too hot to even touch .
When the input voltage to the op amp circuit exceeds the set limit the op amp output and the BC547 conduct and grounds the Arduino BC547 bases causing the mosfets to shut off. As soon as the mosfets are turned off the voltage begins to drop which reverts the opamp output causing the mosfets to switch ON again, and this process continues which keeps the output voltage within the desired range. Your mosfets may be burning due to some fault in your design, or may be the mosfets are not good quality. You can in the video the circuit works… Read more »
Thank you so much .
Hello sir , I have a question about the transformer , if I use a 6-0-6 transformer rated at 3A , at ideal mode , what will be the rated watt of the output ? Since 3A is the highest rated transformer I can buy from our state .
Hello Fanai, it will be 6 x 3 = 18 watts for each half cycle of AC output, that means for one full AC cycle it can be around 36 watts
Hello sir , how are you ?
I have one question regarding the feedback , my circuit is exactly same as your circuit , and the transformer I used is from an old ups , and the question is , the output Voltage decrease to 140V from 210V when connected a load . Will the feedback you posted work in this situation to give 220V constant even when connected to a load ( 60W incandescent lamb) ?
Hello Fanai, no, the feedback cannot boost a low voltage happening due to over load or low wattage transformer/battery, it can only prevent an over voltage. To correct the voltage drop you must ensure that the transformer, battery and the mosfets are appropriately rated as per the load requirement.
k. well the solution i have now will have to do till i have more time. seems the best way forward will be to build an inverter that doesnt have over voltage protection. when i have time i’ll pull the transformer out of the old inverter, check it to make sure it meets the specs of the Arduino. i’ll also grab the heat sinks for the mosfets. then i’ll have another 1000 questions for you. thank you for your help.
Sure, no problem, all the best to you!
this may be off topic, i’m looking at building this inverter but perhaps its better to tell you what my problem is first because you may have a better solution. i have a 300 watt inverter thats doing just fine. but its connected to lithium ion batterys instead of lead acid. the ion batterys are 4p so top voltage is 16.4. the inverter has i high voltage cut off at 15.5. i’m working around the problem by placing a buck inverter between the batterys and the inverter, the buck inverter is set to output at 14.9 volts. this works but… Read more »
Using a buck converter between the battery and the inverter can be indeed quite wasteful, but unfortunately there’s no other more efficient way than a buck converter, except if one Li-Ion cell is reduced, and a 3S combination is used with a 12.6V output supply. The transformer winding rating must match the PWM average of the Arduino in the above explained concept. The average voltage at the collector of the first BC547 transistor will be the value that must match the half winding specs of the transformer. So definitely you can salvage the transformer from your 3000 watt inverter but… Read more »
Hi Mr
i have built a small circuit using Bc547 & Bc557 as BJT Buffer stage to drive the Mosfets (i’m using IRFZ46)
when i check the ouput of the buffer stage connected to Arduino uploaded with Spwm Code
on DSO138 oscilloscope the output is same as the pattern of the wave in the picture on the page
But on the Drain of the Mosfet its square wave
Why?
My circuit using readymade oscillator SG3524 is working fine
But using the Arduino the output is connected to lamp & is flashing
any idea?
thanks for ur Help
I need another circuit which i didnt know how to contact the page
a small Solid state relay using mosfets which have Com positive or negative 12V
NC NO so i can select which source to use power of
2 12v dc inputs & 1 output \]
Thanks
You must do exactly as shown in the above schematics because it is a tested design, I cannot suggest about any other configuration.
if i connect the oscilloscope to the Drain of the Mosfet should i see the same signal as the arduino output ?
i will test it with 2 npn transistors instead of the buffer stage i have used
thanks for help
Drain waveform should be exactly as its gate waveform.
any suggestions why i get square waveform on the drain when the gate is the same as the waveform in ur pic ?
That is actually not possible. The ON/OFF switching of the drain is dependent on the ON/OFF switching of its gate, and has to be exactly similar. Not sure why it is not happening for your mosfet…
It’s interesting how people solve a problem based on their background. You designed a delay circuit to insure there are no shoot-through conditions at startup. Being a software engineer, I would have done this in software by driving the relay with transistor and an output from the Arduino :-).
Great article, thanks for publishing!
Thank you for the feedback, I completely agree, this could have been solved through a code modification also.
Hi swag, which pin of the arduino do I connect the delay generator to
Hi Nimel, the pin8 and pin9 go to the first BC547 via 10k, this BC547 base is configured with the delay timer network, so it is the pin8 nd pin9 which go to the delay network
Sir, can you please share the transformar specification, required for this project.
Thanks.
Partha, the transformer voltage can be 6-0-6V for a 12V battery. Transformer current will depend on the load wattage. If the load wattage is 200 watt then dividing this by 6 becomes 33 amps and so on.
Hello Sir,
Thanks for the article,i would love to know more on the programming aspect.Please can you explain in detail how the avr side of the above circuit works to protect the circuit,like when the LED lights up,what does it mean,and can i implement it on all other inverter circuit?
Hello Patrick, yes you can use the concept universally with any inverter circuit. More details can be learned from the following artcle:
https://www.homemade-circuits.com/load-independentoutput-corrected/
thanks for the good circuit .
Kindly tell if we want to vary the frequency or if we wants the the output at 1 kHz
Thank you for liking the post!
Yes you an change the frequency to any desired value, simply by altering the “microsecond” values in the code.
As you can see, the above applications works with 50 Hz frequency, therefore the microsecond delays in the code across both the channels are adjusted accordingly to a 20 ms value.
For 1kHz, this total delay for the entire code will need to be adjusted to 1 ms.
Hi,
If you are going to use 1kHz, then you will probably need to design or obtain a transformer suitable for the job.
The 50/60Hz transformers used here will not provide the performance at 1 kHz.
The impedance of the transformer is lower at 50/60Hz than at 1 kHz.
In fact it will be 1000/50 = 20 time higher.at 1 kHz.
https://en.wikipedia.org/wiki/Electrical_impedance
It’s around 700 Hz, which is not too bad for an iron core transformer according to me…in my experiment everything worked fine.
Excellent article.
What would need to change to cater for a 24v battery?
Thank you, no changes will be required, except the transformer and the battery. The transformer can be a 12-0-12V for the primary side
здравствуйте. а можно как-то реализовать между переходами мертвую точку или
dead line
dead time already introduced at the end of the codes for each channel
digitalWrite(8, LOW);
//……
digitalWrite(9, LOW);
}
//————————————-//
Thanks for quick reply on my previous comments. I want to know one more thing how output frequency calculated. For some spacial applicatoin i want to lower the frequency about 10Hz to 15Hz. Please help.
It will depend on the specific oscillator circuit, and the RC parts used in the oscillator. The frequency can be changed by changing the RC values of the oscillator.
Thanks for the reply, but think there is some confusion, The frequency shoud be depends on arduino code coz SPWM is generated by arduino itself…?
Yes, for the above Arduino, the frequency is dependent on the delay (microsecond) set for each of the PWM blocks. You can modify them accordingly to ensure the waveform delay coincides with the frequency timing or the Hz rate
Hi, Nice projects is this, I want to make this same project with 110V DC source, can i replace 12V DC source with 110V? Second thing in place of saperate Power On delay circuit, we can also use Arduino itself for the same.
Hi, thanks, for 110V DC input you will need a 110V transformer connected with the MOSFETs.
you can use the same power for the Arduino and the delay circuit