3 Python list, dictionary, set, and JSON data types

Keywords: Python TensorFlow Deep Learning

Applications of Deep Neural Networks with Keras

Translated by Jeff Heaton: The Academic Frontier of Artificial Intelligence

Contents

1. Python Foundation

2. Python for Machine Learning

3. Introduction to TensorFlow

4. Training on Tabular Data

5. Regularization and Dropout

6. Convolutional Neural Networks for Computer Vision

7. Generative Adversarial Networks

8. Kaggle Data Sets

9. Transfer Learning

10. Time Series in Keras

11. Natural Language Processing and Speech Recognition

12. Reinforcement Learning

13. Advanced/Other Topics

14. Other Neural Network Techniques

Part 1.3: Python Lists, Dictionaries, Sets and JSON

Python Data Structures

Like most modern programming languages, Python includes lists, sets, dictionaries, and other data structures as built-in types. The syntax of these structures resembles JSON. Compatibility between Python and JSON is discussed later in this module. This course focuses primarily on lists, sets, and dictionaries. It is important to understand the differences between these fundamental collection types.

Dictionary - A dictionary is a mutable, unordered collection that Python indexes with name and value pairs. (Since Python 3.7 a dictionary preserves insertion order, but programs still access it by key rather than by position.)
List - A list is a mutable, ordered collection that allows duplicate elements.
Set - A set is a mutable, unordered collection with no duplicate elements.
Tuple - A tuple is an immutable, ordered collection that allows duplicate elements.
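
The four collection types listed above can be sketched side by side as follows. This is only an illustration; the variable names and sample values are assumptions chosen for this sketch.

```python
# Illustrative values only; the variable names are assumptions for this sketch.
a_list = ['a', 'b', 'b']      # mutable, ordered, duplicates allowed
a_tuple = ('a', 'b', 'b')     # immutable, ordered, duplicates allowed
a_set = {'a', 'b', 'b'}       # mutable, unordered, the duplicate 'b' is dropped
a_dict = {'name': 'Jeff'}     # mutable, indexed by key

a_list.append('c')                # lists can grow after definition
a_set.add('c')                    # so can sets
a_dict['address'] = '123 Main'    # and dictionaries

print(len(a_list), len(a_tuple), len(a_set), len(a_dict))
```

The tuple is the only one of the four that offers no way to add an element after it is defined.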

Mutable and Immutable Collections

Most Python collections are mutable, meaning that programs can add and remove elements after definition. An immutable collection cannot add or remove items after definition. It is also important to understand that an ordered collection maintains the order in which the program adds items to it. This order need not be any specific ordering, such as alphabetic or numeric.

Lists and tuples are very similar in Python and are often confused. The significant difference is that a list is mutable, while a tuple is not. So we use a list when we want to hold a varying number of similar items, and a tuple when we know in advance what information it will contain.

Many programming languages contain a data collection called an array. Notably, the Python language itself has no built-in array type. Programmers typically use a list in place of an array in Python. Arrays in most programming languages are fixed-length, requiring the program to know the maximum number of elements beforehand. This limitation leads to the notorious array-overrun bugs and security issues. A Python list is much more flexible because the program can change its size dynamically.
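
As a small sketch of this flexibility (the variable names are illustrative), a list can start empty and grow as needed, and reading past its end raises an exception instead of silently touching other memory:

```python
# A list starts empty and grows as needed; no maximum size is declared.
values = []
for n in range(5):
    values.append(n * n)

print(values)
print(len(values))

# Reading past the end raises IndexError rather than overrunning memory.
try:
    values[10]
except IndexError as err:
    print(f"IndexError: {err}")
```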

Lists and Tuples

For a Python program, lists and tuples are very similar. As a programmer, it is possible to get by with only lists and ignore tuples. Both lists and tuples hold an ordered collection of items.

The main syntax difference you will see is that a list is enclosed in square brackets [], and a tuple is enclosed in parentheses (). The code below defines both a list and a tuple.

l = ['a', 'b', 'c', 'd']
t = ('a', 'b', 'c', 'd')


print(l)
print(t)

output

['a', 'b', 'c', 'd']

('a', 'b', 'c', 'd')

The main difference you will see programmatically is that a list is mutable, which means the program can change it. A tuple is immutable, which means the program cannot change it. The code below demonstrates that the program can change a list. This code also shows that Python indexes start at element 0, so accessing element 1 modifies the second element in the collection. One advantage of tuples over lists is that tuples tend to iterate slightly faster than lists.

l[1] = 'changed'
#t[1] = 'changed' # This would result in an error


print(l)

output

['a', 'changed', 'c', 'd']
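
The commented-out line above would fail. As a minimal sketch of what happens when a program tries to change a tuple (the name t2 is introduced here only for illustration):

```python
t = ('a', 'b', 'c', 'd')

try:
    t[1] = 'changed'          # tuples do not support item assignment
except TypeError as err:
    print(f"TypeError: {err}")

# A changed copy can still be built as a brand-new tuple.
t2 = t[:1] + ('changed',) + t[2:]
print(t2)
```

The original tuple t is left untouched; only the new tuple t2 contains the changed value.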

Accessing list elements with a for-each loop

Like many languages, Python also has for-each statements, which allow looping through each element in a collection, such as a list or tuple.

# Iterate over a collection.
for s in l:
    print(s)

output

a

changed

c

d

The enumerate function

The enumerate function is useful for iterating over a collection while also accessing the index of the current element.

# Iterate over a collection and know where your index is. (Python is zero-based!)
# The loop variable is named s so that it does not overwrite the list l.
for i, s in enumerate(l):
    print(f"{i}:{s}")

output

0:a

1:changed

2:c

3:d

List Operations

A program can add multiple objects, such as strings, to a list. Duplicate values are allowed. Tuples do not allow the program to add further objects after definition.

# Manually add items, lists allow duplicates
c = []
c.append('a')
c.append('b')
c.append('c')
c.append('c')
print(c)

output

['a', 'b', 'c', 'c']

Ordered collections, such as lists and tuples, allow you to access an element by its index number, as shown in the code below. Unordered collections, such as dictionaries and sets, do not allow the program to access them in this way.

print(c[1])

output

b
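
As a brief sketch of positional access, the following shows a few further ways to index a list, and what happens when a program tries the same on a set. The list is redefined here so the example stands on its own.

```python
c = ['a', 'b', 'c', 'c']

print(c[0])      # first element
print(c[-1])     # negative indexes count from the end
print(c[1:3])    # a slice returns a new list

s = set(c)
try:
    s[1]         # a set has no positions to index
except TypeError as err:
    print(f"TypeError: {err}")
```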

Lists also support inserting and removing items. For the insert function, the programmer must specify an index. The insert, remove, and del operations shown below are not allowed on tuples because they would change the tuple.

# Insert
c = ['a', 'b', 'c']
c.insert(0, 'a0')
print(c)
# Remove
c.remove('b')
print(c)
# Remove at index
del c[0]
print(c)

output

['a0', 'a', 'b', 'c']

['a0', 'a', 'c']

['a', 'c']

Sets

A Python set holds an unordered collection of objects, but a set does not allow duplicates. If a program adds a duplicate item to a set, only one copy of each item remains in the set. Adding a duplicate item to a set does not result in an error. Any of the following techniques will define a set.

s = set()
s = {'a', 'b', 'c'}
s = set(['a', 'b', 'c'])
print(s)

output

{'c', 'a', 'b'}

A list is always enclosed in square brackets [], a tuple in parentheses (), and, as we now see, a set in curly braces {}. Programs can add items to a set as they run, using the add function. It is important to note that the append function adds items to a list, whereas the add function adds items to a set.

# Manually add items, sets do not allow duplicates
# Sets add, lists append. I find this annoying.
c = set()
c.add('a')
c.add('b')
c.add('c')
c.add('c')
print(c)

output

{'c', 'a', 'b'}
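
Because a set silently drops duplicates, one common use is removing repeated values from a list. The following is a minimal sketch with illustrative variable names; since a set is unordered, sorted is used to get a predictable order back.

```python
names = ['c', 'a', 'b', 'c', 'a']

unique = set(names)              # duplicates collapse to one copy each
print(len(names), len(unique))

print('a' in unique)             # membership tests work on sets

ordered_unique = sorted(unique)  # sorted() returns a list in a defined order
print(ordered_unique)
```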

Maps/Dictionaries/Hash Tables

Many programming languages include the concept of a map, dictionary, or hash table. These are all closely related concepts. Python provides a dictionary, which is essentially a collection of name-value pairs. Programs define dictionaries using curly braces, as shown below.

d = {'name': "Jeff", 'address': "123 Main"}
print(d)
print(d['name'])
if 'name' in d:
    print("Name is defined")
if 'age' in d:
    print("age defined")
else:
    print("age undefined")

output

{'name': 'Jeff', 'address': '123 Main'}

Jeff

Name is defined

age undefined

Be careful not to access an undefined key, as doing so raises an error. You can check whether a key is defined, as shown above. You can also access the dictionary and provide a default value, as shown in the code below.

d.get('unknown_key', 'default')

output

'default'
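
To make the difference concrete, the following sketch shows the error raised by direct access to a missing key alongside the two forms of get. The variable names age and age_or_default are illustrative.

```python
d = {'name': "Jeff", 'address': "123 Main"}

try:
    d['age']                      # direct access to a missing key
except KeyError as err:
    print(f"KeyError: {err}")

age = d.get('age')                # get() without a default returns None
age_or_default = d.get('age', 0)  # get() with a default returns that default
print(age, age_or_default)
```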

You can also access all of the keys and values of a dictionary.

d = {'name': "Jeff", 'address': "123 Main"}
# All of the keys
print(f"Key: {d.keys()}")
# All of the values
print(f"Values: {d.values()}")

output

Key: dict_keys(['name', 'address'])

Values: dict_values(['Jeff', '123 Main'])
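
Keys and values are often needed together. As a short sketch, the items method yields each key with its value, which fits naturally into a for loop:

```python
d = {'name': "Jeff", 'address': "123 Main"}

# items() yields (key, value) tuples, which the loop unpacks into two names.
for key, value in d.items():
    print(f"{key} = {value}")

pairs = list(d.items())
print(pairs)
```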

Dictionaries and lists can be combined. This syntax is closely related to JSON. Together, dictionaries and lists are a good way to build very complex data structures. Python allows strings to be enclosed in either double quotes (") or single quotes ('), while JSON allows only double quotes ("). JSON is covered in more detail later in this module.

The code below shows a mix of dictionary and list usage.

# Python list & map structures
customers = [
    {"name": "Jeff & Tracy Heaton", "pets": ["Wynton", "Cricket",
                                              "Hickory"]},
    {"name": "John Smith", "pets": ["rover"]},
    {"name": "Jane Doe"}
]
print(customers)
for customer in customers:
    print(f"{customer['name']}:{customer.get('pets', 'no pets')}")

output

[{'name': 'Jeff & Tracy Heaton', 'pets': ['Wynton', 'Cricket', 'Hickory']}, {'name': 'John Smith', 'pets': ['rover']}, {'name': 'Jane Doe'}]

Jeff & Tracy Heaton:['Wynton', 'Cricket', 'Hickory']

John Smith:['rover']

Jane Doe:no pets

The variable customers is a list that holds three dictionaries, each representing a customer. You can think of these dictionaries as records in a table. The fields in these individual records are the keys of the dictionary.
Here the keys name and pets are fields. However, the pets field holds a list of pet names. Lists and maps can be nested to any depth. It is also possible to nest a map inside a map or a list inside another list.
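
Reaching into such a nested structure is a matter of chaining index and key lookups. The following sketch repeats the customer data so that it runs on its own; the names first_pet and pet_counts are introduced here only for illustration.

```python
customers = [
    {"name": "Jeff & Tracy Heaton", "pets": ["Wynton", "Cricket", "Hickory"]},
    {"name": "John Smith", "pets": ["rover"]},
    {"name": "Jane Doe"},
]

# Index into the list, then the dictionary, then the nested list.
first_pet = customers[0]["pets"][0]
print(first_pet)

# Count pets per customer; get() supplies an empty list when the key is missing.
pet_counts = {}
for customer in customers:
    pet_counts[customer["name"]] = len(customer.get("pets", []))
print(pet_counts)
```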

More Advanced Lists

Several more advanced features are available for lists, and this section introduces them. One such function is zip. The zip command combines two lists into a single sequence of pairs. The following code demonstrates the zip command.

a = [1,2,3,4,5]
b = [5,4,3,2,1]


print(zip(a,b))

output

<zip object at 0x00000246ea2f1d40>

To see the results of the zip function, we convert the returned zip object into a list. As you can see, the result is a list of tuples. Each tuple represents a pair of items that the function zipped together. The order of the two lists is maintained.

a = [1,2,3,4,5]
b = [5,4,3,2,1]


print(list(zip(a,b)))

output

[(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]

The usual way to use the zip command is inside a for loop. The code below shows how a for loop can assign a variable to each collection that the program is iterating over.

a = [1,2,3,4,5]
b = [5,4,3,2,1]


for x,y in zip(a,b):
    print(f'{x} - {y}')

output

1 - 5

2 - 4

3 - 3

4 - 2

5 - 1

Typically, the two collections passed to a zip command are the same length. Passing collections of different lengths is not an error. As shown in the code below, the zip command only processes elements up to the length of the shorter collection.

a = [1,2,3,4,5]
b = [5,4,3]


print(list(zip(a,b)))

output

[(1, 5), (2, 4), (3, 3)]
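
If the unmatched elements should be kept rather than dropped, one option is zip_longest from the standard itertools module. This is a swapped-in alternative to zip, not something the text above uses; the fill value of 0 is an arbitrary choice for this sketch.

```python
from itertools import zip_longest

a = [1, 2, 3, 4, 5]
b = [5, 4, 3]

# zip_longest pads the shorter collection with fillvalue instead of stopping.
padded = list(zip_longest(a, b, fillvalue=0))
print(padded)
```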

Sometimes you may wish to know the current numeric index while a for loop iterates through an ordered collection. Use the enumerate command to track the index position of each collection element. Because the enumerate command simply numbers elements in iteration order, the indexes it assigns to elements of an unordered collection are arbitrary.

Consider how you might construct a Python program that changes every element greater than 5 to the value 5. The following program performs this transformation. The enumerate command lets the loop know which element index it is currently on, thereby allowing the program to change the value of the current element in the collection.

a = [2, 10, 3, 11, 10, 3, 2, 1]
for i, x in enumerate(a):
    if x>5:
        a[i] = 5
print(a)

output

[2, 5, 3, 5, 5, 3, 2, 1]

A list comprehension builds a list dynamically. The following comprehension iterates from 0 to 9 and adds each value (multiplied by 10) to a list.

lst = [x*10 for x in range(10)]
print(lst)

output

[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
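
A comprehension can also filter or transform existing data. As a sketch under illustrative names, the first line below rebuilds the earlier "cap at 5" example as a new list instead of modifying the original in place, and the second adds an if clause to keep only even values.

```python
a = [2, 10, 3, 11, 10, 3, 2, 1]

# min(x, 5) caps each value at 5 and collects the results in a new list.
capped = [min(x, 5) for x in a]
print(capped)

# A trailing if clause keeps only the values that pass the test.
evens_times_ten = [x * 10 for x in range(10) if x % 2 == 0]
print(evens_times_ten)
```

Unlike the enumerate loop, the comprehension leaves the original list a unchanged.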

A dictionary can also be built with a comprehension. The general format is:

dict_variable = {key:value for (key,value) in dictionary.items()}

One common use for this is to build an index of symbolic column names.

text = ['col-zero','col-one', 'col-two', 'col-three']
lookup = {key:value for (value,key) in enumerate(text)}
print(lookup)

output

{'col-zero': 0, 'col-one': 1, 'col-two': 2, 'col-three': 3}

This can be used to easily find the index of a column by name.

print(f'The index of "col-two" is {lookup["col-two"]}')

output

The index of "col-two" is 2
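
The general format given earlier iterates over the items of an existing dictionary. As a minimal sketch of that form, the following inverts the lookup so that an index maps back to its column name; the name reverse is introduced here only for illustration.

```python
lookup = {'col-zero': 0, 'col-one': 1, 'col-two': 2, 'col-three': 3}

# Swap each (name, index) pair to build the reverse mapping.
reverse = {index: name for (name, index) in lookup.items()}
print(reverse)
print(reverse[2])
```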

An Introduction to JSON

The data stored in a CSV file must be flat and regular; that is, it must fit into rows and columns. Most people refer to this type of data as structured or tabular data. This data is tabular because the number of columns is the same for every row. An individual row may be missing a value for a column; however, these rows still have the same columns.

This type of data is convenient for machine learning because most models, such as neural networks, also expect incoming data to have fixed dimensions. Real-world information is not always so tabular. Consider rows that represent customers. These people may have multiple phone numbers and addresses. How would you describe such data using a fixed number of columns? It would be useful for each row to hold a list, such as the courses taken by a student, whose length can vary from row to row.

JavaScript Object Notation (JSON) is a standard file format that stores data in a hierarchical format similar to Extensible Markup Language (XML). JSON is nothing more than a hierarchy of lists and dictionaries. Programmers refer to this sort of data as semi-structured or hierarchical. The following is a sample JSON file.

{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": null
}

The file above looks somewhat like Python code. You can see curly braces that define dictionaries and square brackets that define lists. JSON requires a single root element, which can be either a list or a dictionary. JSON requires double quotation marks to enclose strings and names. Single quotation marks are not allowed in JSON.
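
The quoting rule is easy to check with Python's standard json module. In the following sketch, the two sample strings and the flag parsed_ok are illustrative; the same data parses with double quotes and is rejected with single quotes.

```python
import json

valid = '{"name": "Jeff"}'      # double quotes: valid JSON
invalid = "{'name': 'Jeff'}"    # single quotes: valid Python, not valid JSON

obj = json.loads(valid)
print(obj)

try:
    json.loads(invalid)
    parsed_ok = True
except json.JSONDecodeError as err:
    parsed_ok = False
    print(f"JSONDecodeError: {err}")
```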

JSON files are always legal JavaScript syntax. JSON is also usually valid as Python code, provided that true, false, and null are written as True, False, and None, as the following Python program demonstrates.

jsonHardCoded = {
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": True,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    },
    {
      "type": "mobile",
      "number": "123 456-7890"
    }
  ],
  "children": [],
  "spouse": None
}

It is generally better to read JSON from a file, a string, or the Internet than to hard code it as is done here. However, this sort of hard coding can sometimes be useful for internal data structures.

Python contains support for JSON in its standard json module. When a Python program loads JSON, it returns the root list or dictionary, as demonstrated by the following code.

import json


json_string = '{"first":"Jeff","last":"Heaton"}'
obj = json.loads(json_string)
print(f"First name: {obj['first']}")
print(f"Last name: {obj['last']}")

output

First name: Jeff

Last name: Heaton

Python programs can also load JSON from a file or a URL.

import requests


r = requests.get("https://raw.githubusercontent.com/jeffheaton/"
                 +"t81_558_deep_learning/master/person.json")
print(r.json())

output

{'firstName': 'John', 'lastName': 'Smith', 'isAlive': True, 'age': 27, 'address': {'streetAddress': '21 2nd Street', 'city': 'New York', 'state': 'NY', 'postalCode': '10021-3100'}, 'phoneNumbers': [{'type': 'home', 'number': '212 555-1234'}, {'type': 'office', 'number': '646 555-4567'}, {'type': 'mobile', 'number': '123 456-7890'}], 'children': [], 'spouse': None}
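
Loading from a local file works much the same way. The following is a minimal sketch that first writes a small JSON file into a temporary folder so that it is self-contained; the file name person.json and the sample record are assumptions for illustration.

```python
import json
import os
import tempfile

person = {"firstName": "John", "lastName": "Smith"}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "person.json")   # hypothetical file name

    # json.dump writes to an open file; json.dumps would return a string.
    with open(path, "w") as f:
        json.dump(person, f)

    # json.load reads from an open file; json.loads would read a string.
    with open(path) as f:
        loaded = json.load(f)

print(loaded)
```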

Python programs can easily generate JSON strings from Python objects built from dictionaries and lists.

python_obj = {"first":"Jeff","last":"Heaton"}
print(json.dumps(python_obj))

output

{"first": "Jeff", "last": "Heaton"}
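
A round trip through JSON does not always return exactly the same Python types. In the following sketch, the pets tuple and the nickname field are illustrative additions; the indent argument of json.dumps simply makes the generated text easier to read.

```python
import json

# "pets" is a tuple and "nickname" is None; both are illustrative additions.
python_obj = {"first": "Jeff", "last": "Heaton",
              "pets": ("Wynton", "Cricket"), "nickname": None}

text = json.dumps(python_obj, indent=2)   # indent makes the output readable
print(text)

round_trip = json.loads(text)
print(round_trip["pets"])       # the tuple comes back as a list
print(round_trip["nickname"])   # JSON null comes back as None
```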

Data scientists often encounter JSON when accessing web services to obtain data. Data scientists can use the techniques described in this section to convert semi-structured JSON data into tabular data for programs to use with models such as neural networks.
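
One possible way to carry out such a conversion, sketched here with a shortened version of the earlier sample record, is to emit one flat row per nested list entry. The column names phoneType and phoneNumber are assumptions chosen for this example.

```python
import json

json_text = """
{
  "firstName": "John",
  "lastName": "Smith",
  "phoneNumbers": [
    {"type": "home", "number": "212 555-1234"},
    {"type": "office", "number": "646 555-4567"}
  ]
}
"""
person = json.loads(json_text)

# Repeat the person-level fields on every row so each row has the same columns.
rows = [
    {
        "firstName": person["firstName"],
        "lastName": person["lastName"],
        "phoneType": phone["type"],
        "phoneNumber": phone["number"],
    }
    for phone in person["phoneNumbers"]
]

for row in rows:
    print(row)
```

Every resulting row has the same four columns, which is the fixed shape that tabular tools and models expect.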

Posted by patheticsam on Fri, 08 Oct 2021 09:04:29 -0700