How to Create a DataFrame in Python - Python Tutorial

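A DataFrame is a two-dimensional collection of data: a data structure that stores data in tabular form, arranged in rows and columns. A single DataFrame can hold several sets of data, and it supports many operations, such as selecting rows or columns and adding new ones.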

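In Python, the DataFrame is a core component of the pandas library and serves as a general-purpose two-dimensional data container. It resembles a table: the rows and the columns each carry their own index labels. Different columns can hold different data types, which makes it practical for heterogeneous datasets.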

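pandas DataFrames offer a broad set of features, from building structured data out of dictionaries and other data structures to label-based indexing for data access. The library provides an intuitive interface for operations such as filtering rows by a condition, grouping data for aggregation, and computing statistics.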
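As a rough illustration of the operations just mentioned, the sketch below filters, groups, and summarizes a small DataFrame. The column names and values are invented for this example and do not come from the tutorial.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
df = pd.DataFrame({
    "team": ["A", "A", "B", "B"],
    "score": [10, 20, 30, 50],
})

# Filter rows on a condition
high = df[df["score"] > 15]

# Group rows by a column and aggregate each group
totals = df.groupby("team")["score"].sum()

# Compute a simple statistic over one column
mean_score = df["score"].mean()

print(high)
print(totals)
print(mean_score)
```

Each step returns a new object (a DataFrame, a Series, and a scalar respectively), so the original df is left unchanged.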

A DataFrame can also be loaded from external storage such as a SQL database, a CSV file, or an Excel file. Lists, dictionaries, lists of dictionaries, and similar structures can be used as well.
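Loading from external storage is not shown elsewhere in this tutorial, so here is a minimal sketch using pd.read_csv. To keep it self-contained, an in-memory string (hypothetical contents) stands in for a CSV file on disk; with a real file you would pass its path instead.

```python
import io

import pandas as pd

# In-memory text standing in for a CSV file (hypothetical contents)
csv_text = "Name,Age\nTom,20\nJoseph,21\n"

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

pandas offers analogous readers for the other sources mentioned above, such as pd.read_excel and pd.read_sql, which need an Excel engine or a database connection respectively.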

In this tutorial, we will learn several ways to create a DataFrame. Let's go through them one by one.

First, install the pandas library in your Python environment.

Method - 1: Create an empty DataFrame

We can create a basic, empty DataFrame. To create a DataFrame, call the DataFrame constructor. Consider the following example.

Example -

    # Import the pandas library as pd
    import pandas as pd

    # Call the DataFrame constructor with no arguments
    df = pd.DataFrame()

    # Print the DataFrame
    print(df)

Output:

    Empty DataFrame
    Columns: []
    Index: []
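An empty DataFrame is usually a starting point that gets filled in later. The sketch below, with an invented column name, checks that the DataFrame is empty and then adds a column to it.

```python
import pandas as pd

df = pd.DataFrame()

# An empty DataFrame has no rows and no columns
was_empty = df.empty
print(was_empty)   # True
print(df.shape)    # (0, 0)

# Assigning a list creates the column and its rows
# ("Language" is a hypothetical column name)
df["Language"] = ["Python", "Go"]
print(df.shape)    # (2, 1)
```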

Method - 2: Create a DataFrame using a list

We can create a DataFrame from a single list or from a list of lists. Consider the following example.

Example -

    # Import the pandas library as pd
    import pandas as pd

    # Declare the string values in a list
    lst = ['Java', 'Python', 'C', 'C++', 'JavaScript', 'Swift', 'Go']

    # Call the DataFrame constructor on the list
    dframe = pd.DataFrame(lst)

    # Print the DataFrame
    print(dframe)

Output:

                0
    0        Java
    1      Python
    2           C
    3         C++
    4  JavaScript
    5       Swift
    6          Go

Explanation:

Import pandas: import pandas as pd imports the pandas library under the short alias pd.
Create the list: lst is a list of string values, each the name of a programming language.
Build the DataFrame: pd.DataFrame(lst) constructs a DataFrame from lst. Given a single flat list, pandas creates a DataFrame with one column, labeled 0 by default.
Print the DataFrame: print(dframe) prints the resulting DataFrame.
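The list-of-lists case mentioned at the start of this method could look like the sketch below. Each inner list becomes one row; the column names and the year values are illustrative additions, not part of the original example.

```python
import pandas as pd

# Each inner list is one row (illustrative values)
rows = [["Python", 1991], ["Go", 2009], ["Swift", 2014]]

# The columns argument names the columns; without it they would be 0 and 1
dframe = pd.DataFrame(rows, columns=["Language", "Year"])
print(dframe)
```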

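Method - 3: Create a DataFrame from a dict of ndarrays/lists

We can create a DataFrame from a dict of ndarrays or lists. All the arrays must have the same length. By default, the index is range(n), where n is the length of the arrays. Consider the following example.

Example -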

    # Import the pandas library as pd
    import pandas as pd

    # Assign the data: a dict of lists
    data = {'Name': ['Tom', 'Joseph', 'Krish', 'John'], 'Age': [20, 21, 19, 18]}

    # Create the DataFrame
    df = pd.DataFrame(data)

    # Print the DataFrame
    print(df)

Output:

         Name  Age
    0     Tom   20
    1  Joseph   21
    2   Krish   19
    3    John   18

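Explanation:

Import pandas: import pandas as pd imports the pandas library under the alias pd.
Create the dictionary: data is a dictionary whose keys are the column names ('Name' and 'Age') and whose values are lists holding the corresponding data.
Build the DataFrame: pd.DataFrame(data) constructs a DataFrame from the dictionary. The keys become the column names and the lists become the columns.
Print the DataFrame: print(df) prints the resulting DataFrame.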
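The example above uses plain lists. The sketch below repeats it with NumPy arrays and also shows what happens when the equal-length requirement is violated; the mismatch_error variable is a hypothetical name used only to capture the exception.

```python
import numpy as np
import pandas as pd

data = {
    "Name": np.array(["Tom", "Joseph", "Krish", "John"]),
    "Age": np.array([20, 21, 19, 18]),
}
df = pd.DataFrame(data)

# The default index is a range over the array length
print(df.index)  # RangeIndex(start=0, stop=4, step=1)

# Arrays of different lengths are rejected with a ValueError
try:
    pd.DataFrame({"Name": ["Tom", "Joseph"], "Age": [20]})
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(type(mismatch_error).__name__)
```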

Method - 4: Create an indexed DataFrame using arrays

The following example creates a DataFrame with a custom index from arrays.

Example -

    # Create an indexed DataFrame using arrays
    import pandas as pd

    # Assign the data: a dict of lists
    data = {'Name': ['Renault', 'Duster', 'Maruti', 'Honda City'], 'Ratings': [9.0, 8.0, 5.0, 3.0]}

    # Create the pandas DataFrame with explicit row labels
    df = pd.DataFrame(data, index=['position1', 'position2', 'position3', 'position4'])

    # Print the data
    print(df)

Output:

                     Name  Ratings
    position1     Renault      9.0
    position2      Duster      8.0
    position3      Maruti      5.0
    position4  Honda City      3.0

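Explanation:

Import pandas: import pandas as pd imports the pandas library under the alias pd.
Create the dictionary: data is a dictionary whose keys are the column names ('Name' and 'Ratings') and whose values are lists holding the corresponding data.
Build the DataFrame: pd.DataFrame(data, index=['position1', 'position2', 'position3', 'position4']) constructs a DataFrame from the dictionary. The labels in the index list are assigned to the rows in order.
Print the DataFrame: print(df) prints the resulting DataFrame.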
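A custom index is mainly useful for looking rows up by label. As a small sketch on a shortened version of the data above, .loc selects a whole row or a single cell by its labels.

```python
import pandas as pd

data = {"Name": ["Renault", "Duster"], "Ratings": [9.0, 8.0]}
df = pd.DataFrame(data, index=["position1", "position2"])

# Select a whole row by its label; the result is a Series
row = df.loc["position2"]

# Select a single cell by row label and column name
rating = df.loc["position1", "Ratings"]

print(row)
print(rating)
```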

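Method - 5: Create a DataFrame from a list of dicts

We can pass a list of dictionaries as the input data for a pandas DataFrame. By default, the dictionary keys are used as the column names. Consider the following example.

Example -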

    # Create a pandas DataFrame from a list of dicts
    import pandas as pd

    # Assign the data: a list of two dictionaries with different keys
    data = [{'A': 10, 'B': 20, 'C': 30}, {'x': 100, 'y': 200, 'z': 300}]

    # Create the DataFrame
    df = pd.DataFrame(data)

    # Print the DataFrame
    print(df)

Output:

          A     B     C      x      y      z
    0  10.0  20.0  30.0    NaN    NaN    NaN
    1   NaN   NaN   NaN  100.0  200.0  300.0

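Explanation:

Import pandas: import pandas as pd imports the pandas library under the alias pd.
Create the list of dictionaries: data is a list in which each dictionary represents one row of the DataFrame. The dictionary keys become the column names.
Build the DataFrame: pd.DataFrame(data) constructs a DataFrame from the list of dictionaries. The keys become the columns and the values become the data. Where a dictionary lacks a key, the cell is NaN.
Print the DataFrame: print(df) prints the resulting DataFrame.

Let's look at another example, which creates a pandas DataFrame from a list of dictionaries with both a row index and a column index.

Example - 2: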

    # Import the pandas library as pd
    import pandas as pd

    # Assign the data: a list of two dictionaries
    data = [{'x': 1, 'y': 2}, {'A': 15, 'B': 17, 'C': 19}]

    # Two column labels, both matching dictionary keys
    dframe1 = pd.DataFrame(data, index=['first', 'second'], columns=['x', 'y'])

    # Two column labels, one of which ('y1') matches no key
    dframe2 = pd.DataFrame(data, index=['first', 'second'], columns=['x', 'y1'])

    # Print the first DataFrame, dframe1, followed by a blank line
    print(dframe1, '\n')

    # Print the second DataFrame, dframe2
    print(dframe2)

Output:

              x    y
    first   1.0  2.0
    second  NaN  NaN

              x   y1
    first   1.0  NaN
    second  NaN  NaN

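Explanation:

The pandas library is used to create two separate DataFrames, dframe1 and dframe2, from a list of dictionaries named data. Each dictionary describes one row: its keys correspond to column names and its values to the cell contents. The first DataFrame, dframe1, is built with explicit row labels ('first' and 'second') and column labels ('x' and 'y'). The second, dframe2, is built from the same data but with the column labels 'x' and 'y1'. Because no dictionary has a key named 'y1', that column contains only NaN, and because the second dictionary has neither 'x' nor 'y', the row 'second' is NaN in both DataFrames. Finally, the code prints both DataFrames to the console, showing how the columns argument determines which keys end up in each DataFrame.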
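To isolate the effect of the columns argument, here is a minimal sketch on invented data in which every dictionary has the same keys. The variable names subset and with_missing are hypothetical.

```python
import pandas as pd

# Invented data: both rows share the keys x, y and z
data = [{"x": 1, "y": 2, "z": 3}, {"x": 4, "y": 5, "z": 6}]

# Only the requested keys are kept, in the requested order
subset = pd.DataFrame(data, columns=["z", "x"])

# A requested name that matches no key yields a column of missing values
with_missing = pd.DataFrame(data, columns=["x", "w"])

print(subset)
print(with_missing)
```

In other words, columns acts as both a selector and an ordering for the dictionary keys, which is why 'y1' in the example above produced an all-NaN column.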

Example - 3

    # Create a pandas DataFrame by passing a list of
    # dictionaries and row indices
    import pandas as pd

    # Assign the data: a list of two dictionaries
    data = [{'x': 2, 'z': 3}, {'x': 10, 'y': 20, 'z': 30}]

    # Create the pandas DataFrame by passing the
    # list of dictionaries and the row index
    dframe = pd.DataFrame(data, index=['first', 'second'])

    # Print the DataFrame
    print(dframe)

Output:

             x   z     y
    first    2   3   NaN
    second  10  30  20.0

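Explanation:

This code builds a pandas DataFrame from a list of dictionaries while specifying the row labels. It starts by importing pandas under the alias pd. It then defines a list of dictionaries named data, in which each dictionary represents one row of the DataFrame. The keys are column names and the values are the corresponding data.

The DataFrame dframe is created with the pd.DataFrame() constructor, which takes the data and explicitly sets the row labels to 'first' and 'second'. The resulting table has the columns 'x', 'z', and 'y'. Recent pandas releases keep the keys in order of first appearance, while older releases sorted them alphabetically. Missing values are shown as NaN.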
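One side effect of missing keys is worth seeing directly: a column that contains NaN is stored as floating point, which is why 20 is displayed as 20.0. The sketch below inspects the column types and replaces the missing value. Using 0 as the fill value is an arbitrary choice for illustration.

```python
import pandas as pd

data = [{"x": 2, "z": 3}, {"x": 10, "y": 20, "z": 30}]
dframe = pd.DataFrame(data, index=["first", "second"])

# Columns without gaps stay integer; the column with a gap becomes float
print(dframe.dtypes)

# Replace missing values with a chosen default (0 here, an arbitrary choice)
filled = dframe.fillna(0)
print(filled)
```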

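Method - 6: Create a DataFrame using the zip() function

The zip() function pairs up the elements of two or more lists, position by position, into tuples. Consider the following example.

Example -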

    # Create a pandas DataFrame from lists using zip()
    import pandas as pd

    # List 1
    Name = ['tom', 'krish', 'arun', 'juli']

    # List 2
    Marks = [95, 63, 54, 47]

    # Pair up the two lists with zip() and collect the tuples in a list
    list_tuples = list(zip(Name, Marks))
    print(list_tuples)

    # Convert the list of tuples into a pandas DataFrame
    dframe = pd.DataFrame(list_tuples, columns=['Name', 'Marks'])

    # Print the data
    print(dframe)

Output:

    [('tom', 95), ('krish', 63), ('arun', 54), ('juli', 47)]
        Name  Marks
    0    tom     95
    1  krish     63
    2   arun     54
    3   juli     47

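Explanation:

This code shows how to build a pandas DataFrame from two lists, Name and Marks, using pandas together with the built-in zip() function. After pandas is imported, the Name and Marks lists are defined. They correspond to the desired columns of the DataFrame. zip() pairs the elements at matching positions into tuples, and list() collects them into a new list named list_tuples.

The code then prints the list of tuples so the combined data can be checked. Next, the pd.DataFrame() constructor creates a DataFrame named dframe, converting the list of tuples into a table. The column names 'Name' and 'Marks' are assigned explicitly through the columns argument.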
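The same pattern extends to more than two lists. In the sketch below the grades list is a hypothetical addition, and the last lines illustrate a point to watch for: zip() stops at the shortest input, so lists of unequal length silently lose elements.

```python
import pandas as pd

names = ["tom", "krish", "arun"]
marks = [95, 63, 54]
grades = ["A", "B", "C"]  # hypothetical third list, not in the original example

# zip() accepts any number of iterables
dframe = pd.DataFrame(
    list(zip(names, marks, grades)),
    columns=["Name", "Marks", "Grade"],
)
print(dframe)

# zip() truncates to the shortest input: 'arun' is dropped here
short = list(zip(names, [95, 63]))
print(short)
```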

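Method - 7: Create a DataFrame from a dict of Series

We can create a DataFrame by passing a dictionary of Series. The resulting index is the union of the indexes of all the Series passed in. Consider the following example.

Example -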

    # Create a pandas DataFrame from a dict of Series
    import pandas as pd

    # Initialize the data as a dict of Series
    d = {'Electronics': pd.Series([97, 56, 87, 45], index=['John', 'Abhinay', 'Peter', 'Andrew']),
         'Civil': pd.Series([97, 88, 44, 96], index=['John', 'Abhinay', 'Peter', 'Andrew'])}

    # Create the DataFrame
    dframe = pd.DataFrame(d)

    # Print the data
    print(dframe)

Output:

             Electronics  Civil
    John              97     97
    Abhinay           56     88
    Peter             87     44
    Andrew            45     96

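Explanation:

This code creates a pandas DataFrame from a dictionary of Series. The two subjects, 'Electronics' and 'Civil', become the columns, and the individual scores, each labeled with an explicit index, are combined into the DataFrame named dframe. The resulting table is printed to the console, showing a compact way to organize and inspect labeled data with pandas.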
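In the example above both Series share the same index, so the union is not visible. The sketch below uses shortened, partly different indexes (a variation invented for illustration) to show how the union works: every label appears once, and a label missing from one Series gets NaN in that column.

```python
import pandas as pd

d = {
    "Electronics": pd.Series([97, 56], index=["John", "Abhinay"]),
    "Civil": pd.Series([88, 44], index=["Abhinay", "Peter"]),
}

# Rows are aligned by label, not by position
dframe = pd.DataFrame(d)
print(dframe)
```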

In this tutorial, we covered several ways to create a DataFrame.